Preface


This second edition, entitled Logic, Mathematics, and Computer Science: Modern 
Foundations with Practical Applications, has been adapted from Foundations of 
Logic and Mathematics: Applications to Computer Science and Cryptography, © 
2002 by Birkhäuser, from which Chapters 1-5 have been retained but extensively 
revised. Chapters 6 and 7 have been added. 

This text discusses the foundations where logic, mathematics, and computer 
science begin. The intended readership consists of undergraduate students majoring 
in mathematics or computer science who must learn such foundations either for their 
own interest or for further studies. For a motivated reader, there are no technical 
prerequisites: you need not know any technical subject to start reading this text. 

Although the text does not focus on the history and philosophy of the founda- 
tions, the material cites copious references to the literature, where the reader may 
find additional historical context. Consulting such references is neither suggested 
nor necessary to study the theory or to work on the exercises, but individual citations 
document the material by original sources, and all the citations together provide a 
guide to the variations and chronological developments of logic, mathematics, and 
computer science. For example, Chapter 1 traces the origin of Truth tables to Charles 
Sanders Peirce’s unpublished 1909 Logic Notebook on philosophy and points out 
their applications over one half of a century later to the design of computers for use 
on Earth and on board the Apollo lunar spacecraft. 

Along with informal arguments, this text also shows the corresponding purely
symbolic manipulations of formulae, because they clarify the reasoning [11] and can 
reveal hitherto hidden logical properties, such as the mutual independence of 
different patterns of reasoning, or the impossibility of some proofs within some 
logics: 

As for algebra [of logic], the very idea of the art is that it presents formulae which can be 
manipulated, and that by observing the effects of such manipulation we find properties not 
to be otherwise discerned (Charles Sanders Peirce [104, p. 182]). 


If professionals are unable to learn some topics by any means other than the pure 
manipulation of symbols, then it would seem unfair to claim that all learning must be 
intuitive and hide from students such purely manipulative but successful methods. 

The selection of topics also reflects major accomplishments from the twentieth 
century: the foundation of all of mathematics, and later computer science, as well 
as computer-assisted proofs of mathematical theorems, on a formal logic based on 
only a few axioms, transformation rules, and postulates for set theory [47, 50, 54, 
105, 139]. Also, while not written in formal logic, Nobel-Prize winning applications 
to the social sciences rely on the same foundations, as shown in Chapter 7. 

To introduce the foundations of logic, the provability theorem in Chapter 1 
provides an algorithm to design proofs in propositional logic. Chapter 1 also 
explains the concept of undecidability with multi-valued (“fuzzy”) logic and 
presents a proof of unprovability. Chapter 2 introduces logical quantifiers. 
A working knowledge of logical quantifiers is crucial for the study of basic concepts 
in modern mathematical analysis and topology, such as the uniform convergence 
of a sequence of equicontinuous functions. Continuing with the foundations of 
mathematics, Chapter 3 presents a version of the Zermelo–Fraenkel set theory. At 
the juncture of mathematics and computer science, Chapter 4 develops the concepts 
of definition and proof by induction. Chapter 4 then uses induction with set theory 
to define the integers and rational numbers and derive the associative, commutative, 
and distributive laws, as well as algorithms, for their arithmetics. To give readers 
some idea of topics at an intermediate level, Chapter 5 shows that in a well-formed 
theory some paradoxes do not occur, while Chapter 6 completes the foundations of 
set theory with the axiom of choice. 

No extragalactic asteroid has yet been found with the universal laws of logic 
engraved in it. Consequently, not just one logic but many different logics have 
been invented. Different logics lead to different mathematics and different computer 
sciences. However, the acid test for adopting a particular logic is its ability 
to make predictions that are borne out by subsequent experiments. Formal logic is 
thus a mathematical model of rational thought processes. In this aspect, logic, 
mathematics, and computer science are experimental sciences. Only one logic has 
passed all such tests, which is the one used throughout this text. Other logics 
are outlined in Chapter 1 as a pedagogical device and to show some of their 
shortcomings. 

Chapter 1 
Propositional Logic: Proofs from Axioms 
and Inference Rules 


1.1. Introduction 


This chapter introduces propositional logics, which consist of starting formulae 
called axioms and rules of inference to derive from the axioms other formulae called 
theorems. Axioms and rules of inference form a mathematical model of rational 
thinking processes; theorems are their consequences. Different such logics, which 
are also called calculi, rely on different axioms or different rules of inference. 
For example, the Pure Positive Implicational Propositional Calculus focuses only 
on the logical implication. The first few sections derive some of its theorems, for 
instance, the transitivity of the logical implication and the law of commutation, 
using the rule of Detachment with the laws of affirmation of the consequent and 
self-distributivity of the logical implication taken as axiom schemata from Frege and 
Lukasiewicz. A preliminary version of the Deduction Theorem for the propositional 
calculus provides a method for designing proofs. Another section shows the mutual 
equivalence of these axioms with Kleene’s axioms and Tarski’s axioms. Adding 
the converse law of contraposition, subsequent sections focus on the Classical 
Propositional Calculus, deriving the laws of double negation, reductio ad absurdum, 
proofs by contradiction, and proofs by cases. Yet another section outlines the 
equivalence of Frege and Lukasiewicz’s axioms with Church’s, Kleene’s, Tarski’s, 
and Rosser’s axioms, respectively. A final section demonstrates the contrast between 
logics that admit of a recipe for constructing proofs of all “valid” formulae, and 
logics where some formulae are “valid” but unprovable. The prerequisite for this 
chapter is the ability to read, compare, and substitute sequences and tables of 
symbols. The goal of this chapter is merely to develop a working knowledge of 
propositional logic: 
Young man, in mathematics you don’t understand things, you just get used to them. — John 
von Neumann, cited by Felix Smith, cited by Gary Zukav [148, p. 226]. 

Logic, mathematics, and computations can be traced through several millennia 
to ancient civilizations in Babylonia, China, and India. Documents attributed to 
them show methods to calculate such items as taxes, the dimensions of altars, 
and the dates of future solstices or eclipses. More complicated problems arose, for 
example, the determination of the shapes and sizes of the Earth and the Moon, or the 
distances from Earth to the Moon and the Sun. (For a survey of these ancient records, 
consult, for instance, the texts by Dreyer [27], Evans [32], Neugebauer [98-100], 
Van Brummelen [133], and van der Waerden [134].) The solutions to such problems 
require methods more sophisticated than mere calculation, and hence arises the need 
for a study of logic itself, which can be traced to the Greece of a couple of millennia 
ago. This study of logic continues: the ambiguity of the classical verbal exposition 
of logic and the need for unambiguity in complex situations led to algebraic and 
symbolic treatments of logic, for instance, the Truth tables presented here. The point 
of logic is not only “Truth” but also “relevance” to its users [24, p. 6]. Yet relevance 
is subjective. Therefore, the following subsection presents an example that is not 
only claimed but documented to be relevant in the real lives of real people. 


1.1.1 An Example Demonstrating the Use of Logic in Real Life 


For the purpose of an introduction, the following example demonstrates how logic 
can help in resolving practical issues in real life, how questions arise about the 
validity of logical methods to reach conclusions, and eventually what thought 
processes are acceptable or successful in explanations and predictions. Yet an 
understanding of this example will not be necessary for any of the subsequent 
material. 


1.1 Example. The planetary status of Pluto has been debated in newspapers: 


Is Pluto really a planet? Like all civil wars, this has even split families apart [67]. 


The public’s interest in Pluto’s planet-hood is sufficient to devote an entire book 
explaining the question to children [73]. Textbooks have classified Pluto as a planet 
since its discovery in 1930 [69, p. 213], but the question remains whether Pluto 
might rather be an asteroid [75]. Various answers rely on various definitions and on 
logic [101]. For instance, one definition states that planets are bigger than moons. 
This definition can be stated in terms of a hypothesis (abbreviated by H), 


hypothesis H: a celestial object P is a planet, 
a conclusion (abbreviated by C), 

conclusion C: the celestial object P must be bigger than every moon, 
and a logical implication of the form “if H, then C”: 


If a celestial object P is a planet (if H), 
then the celestial object P must be bigger than every moon (then C). 

The logical implication “if H, then C” (also worded as “H implies C”) is true by this 
definition of planets. If the hypothesis H is also true, then the conclusion C is true, 
too: C can be “detached from the implication” [80, p. 124]. 

With a logical implication (“if H, then C”) there are two other useful statements: 
its converse (“if C, then H”) and its contraposition (“if not C, then not H”). 

The hypothesis H can be tested by the contraposition “if not C, then not H”: 


If a celestial object P is not bigger than every moon (if not C), 
then P is not a planet (then not H). 


In 1978 measurements revealed that Pluto was smaller than the Moon [69, p. 213]; 
consequently, Pluto would no longer be a planet, by the foregoing definition. 

The definition “if H, then C” can also be tested in practice. For instance, 
textbooks classify Mercury as a planet, but they also classify Ganymede as a moon 
(of Jupiter), even though Mercury is smaller than Ganymede [69, p. 182 & 203]. 
Thus the statement “if Mercury is a planet, then Mercury is bigger than every moon” 
is false. Therefore the foregoing definition “if H, then C” is false, and Pluto can 
remain a planet. Logic has thus resolved the issue by revealing that the question 
pertains not to the status of Pluto but to the definition of planets. 

Other definitions and logical arguments have also been debated [73, 101]. Very 
shortly thereafter, all existing definitions of planets were again put into question: 


Scientifically, we are unable to define a planet in a sufficiently meaningful way such as to 
include Pluto without including many other objects [...] we are also unable to develop 
a definition based on principles of astronomy and physics that excludes Pluto in any 
nonarbitrary way [137, p. 219, summarizing chapter 14, p. 185-221]. 


The definition of planet will not be settled here, but some patterns of logical 
reasoning will. For instance, the preceding arguments show that detachment and 
contraposition are valid popular and scientific modes of reasoning; converse is not. 

The preceding discussion contains one logical principle that has been so suc- 
cessful that it remains widely accepted in theory and in practice: the law of 
contraposition, which states that if an implication “if H, then C” holds, then its 
contraposition “if not C, then not H” also holds. 

The converse law of contraposition, which states that if the contraposition 
“if not C, then not H” holds, then the implication “if H, then C” also holds, was 
not used in the preceding discussion, and it is not a part of some logical systems. 

In this example, the converse statement “if C, then H” (also worded as “C implies 
H”) is false, because the Sun is bigger than every moon (C is true), but the Sun is 
not a planet (H is false). Nevertheless, implications of the form “if H, then C” and 
their converse “if C, then H” have been confused by professional scientists, so that 
the difference between an implication and its converse bears being emphasized: 


Leontovich explained to me why the paper could not be published in ZETP [“the main 
Russian physics journal”]: 


the paper claims that “A implies B” while every physicist knows examples showing 
that B does not imply A; [...] 
An author, claiming that A implies B, must say whether the converse holds, otherwise the 
reader who is not spoiled by the mathematical slang would interpret the claim as “A is 
equivalent to B.” If mathematicians do not follow this rule, they are wrong [3, p. 619-620]. 


Such confusion among world-class scientists shows the necessity of specifying 
patterns of rational thought processes exactly, for instance, as done in this chapter. 
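The difference between an implication, its converse, and its contraposition can also be tabulated mechanically. The following minimal sketch assumes the classical two-valued reading of “if H, then C” (false only when H is true and C is false), which this chapter justifies only later through Truth tables; the function name `implies` and the dictionary keys are hypothetical choices made for this illustration.

```python
from itertools import product

def implies(h: bool, c: bool) -> bool:
    """Classical two-valued implication (an assumption of this sketch):
    false only when h is true and c is false."""
    return (not h) or c

rows = []
for h, c in product([True, False], repeat=2):
    rows.append({
        "H": h,
        "C": c,
        "implication": implies(h, c),             # if H, then C
        "converse": implies(c, h),                # if C, then H
        "contraposition": implies(not c, not h),  # if not C, then not H
    })

for row in rows:
    print(row)

# The implication agrees with its contraposition in every row ...
same_as_contraposition = all(r["implication"] == r["contraposition"] for r in rows)
# ... but not with its converse.
same_as_converse = all(r["implication"] == r["converse"] for r in rows)
```

The row with H false and C true corresponds to the Sun in example 1.1: the implication holds there while the converse fails.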


1.2 Remark (difficulties with real examples). Some examples of uses of logic may 
enhance the effectiveness of the exposition. Such examples might consist of English 
sentences, for example, “Pluto is a planet.” However, difficulties arise in determining 
whether and why such a practical sentence is true or false. Indeed, a statement 
as simple as “Pluto is a planet” can immediately be challenged to no end, as 
demonstrated in example 1.1. Moreover, the word “planet” comes from the Greek 
word for “wanderer” and therefore in antiquity the Sun was also considered a planet 
[133, p.3]. Thus the argument about such an elementary practical logical issue as 
Pluto’s planetary status really has no ends in any direction, without any mention of 
other more complex practical questions. Therefore, to focus on logic, mathematics, 
and computing, instead of endlessly debatable issues, the following discussion will 
also use “toy” examples with truth or falsity decided in advance. 


1.2. The Pure Propositional Calculus 


Propositional logic focuses on the derivation of conclusions from hypotheses, by 
means of rules of inference and initial hypotheses called axioms that state patterns of 
rational thinking precisely. Different axioms or different rules of inference may lead 
to different logics or to mutually equivalent logics, for instance, the Pure Positive 
Implicational Propositional Logic, which is a part of the full Classical Propositional 
Logic. 

Different readers may prefer prose or formulae. Jan Lukasiewicz stated concisely 
the disadvantage of prose in formulating and solving logical problems: 


Alles zerfließt in vagen philosophischen Spekulationen [80, p. 125] 
(“everything melts into vague philosophical speculations”). 


In contrast, formulae provide greater precision in complex situations [11]. To this 
end, the Pure Propositional Calculus (also called Sentential Calculus) presented 
here can be traced to Gottlob Frege [38, 39]. The adjective “pure” means that the 
simplest (‘atomic’) formulae are letters or symbols that do not denote anything: 


mathematical logic is a meaningless game with symbols [108, p. xi]. 


Yet such symbols may later denote various types of atomic formulae that apply 
to such various fields as algebra, arithmetic, geometry, or set theory, and therein 
lies the power of the pure propositional calculus. Furthermore, purely symbolic 
manipulations of formulae can reveal hitherto hidden logical properties: 

As for algebra [of logic], the very idea of the art is that it presents formulae which can be 
manipulated, and that by observing the effects of such manipulation we find properties not 
to be otherwise discerned (Charles Sanders Peirce, [104, p. 182]). 


If famous logicians find manipulations of formulae indispensable in clarifying logic, 
then such manipulations might also help nonspecialists in studying logic. 


1.2.1 Formulae, Axioms, Inference Rules, and Proofs 


Common instances of logical reasoning consist of sentences. For example, the first 
axiom of Euclidean geometry is a sentence (paraphrased from [61, p. 3]): 


For each pair of distinct points there exists exactly one line passing through both points. 


Similarly, propositional logic starts with certain logical formulae called axioms. 
The word “axiom” can mean “self-evident truth” but in the present context, which 
focuses on patterns of reasoning, the word “axiom” means “initial” or “starting” 
logical pattern [110, p.55]. Different selections of axioms can lead to different 
kinds of logic, but the present chapter focuses mainly on classical logic, which 
has been successful for several millennia. Several choices for the initial axioms and 
formulae lead to the same classical logic. Because the principal concepts of logic 
consist of “negation” and “implication”, several common choices of initial axioms 
and formulae involve only the connectives for negation ($\neg$) and implication ($\Rightarrow$). 
Also, to allow for applications in various areas, the pure propositional calculus 
replaces the “atomic formulae” from algebra, arithmetic, geometry, or set theory 
by general symbols called propositional variables or sentence symbols. 


1.3 Definition (Well-formed formulae). Select two disjoint lists of symbols. 

Every symbol from the first list of symbols, which may consist of one or more 
letter(s) from a specified alphabet, P, Q, ..., optionally with subscript(s), 
superscript(s), or “middlescript(s)” (for instance, P*, P|, P||), is called a formulaic 
letter. Such formulaic letters are not parts of the propositional calculus, but they 
help in describing the following rules to define well-formed formulae. 

Also, every symbol from the second list of symbols, which may consist of one 
or more letter(s) from a specified alphabet, A, B, ..., optionally with subscript(s), 
superscript(s), or “middlescript(s)” (for instance, A*, A|, A||), is called a 
propositional variable or a sentence symbol [31, p. 17]. (Propositional variables 
may later be replaced by atomic formulae specific to applications.) 

Every propositional variable is a well-formed formula. For all well-formed 
formulae P and Q, the following strings of symbols are also well-formed formulae: 


(W1) $\neg(P)$ (read “not P”), 
(W2) $(P) \Rightarrow (Q)$ (read “P implies Q” or “if P, then Q”). 

Furthermore, only strings of symbols built from letters or variables through 
applications of the rules W1 and W2 can be well-formed formulae. Equivalent 
definitions apply to other connectives and to prefix and postfix notations. 
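Definition 1.3 is recursive, so it translates directly into a recursive data structure. The sketch below is one possible encoding and not part of the text: propositional variables are strings, rule W1 becomes the tuple `("not", P)`, and rule W2 becomes the tuple `("imp", P, Q)`. The set `VARIABLES` and the helper names `neg`, `imp`, `is_wff`, and `render` are assumptions made for this illustration, and `render` uses parentheses only, without the brackets and braces that the text uses for readability.

```python
VARIABLES = {"A", "B", "C"}  # hypothetical list of propositional variables

def neg(p):
    """Rule W1: build the formula 'not p'."""
    return ("not", p)

def imp(p, q):
    """Rule W2: build the formula 'p implies q'."""
    return ("imp", p, q)

def is_wff(f) -> bool:
    """Return True only for objects built from variables through W1 and W2."""
    if isinstance(f, str):
        return f in VARIABLES
    if isinstance(f, tuple) and len(f) == 2 and f[0] == "not":
        return is_wff(f[1])
    if isinstance(f, tuple) and len(f) == 3 and f[0] == "imp":
        return is_wff(f[1]) and is_wff(f[2])
    return False

def render(f) -> str:
    """Write a formula with the parentheses required by W1 and W2."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return "¬(" + render(f[1]) + ")"
    return "(" + render(f[1]) + ") ⇒ (" + render(f[2]) + ")"

example = imp(neg("A"), imp("B", "A"))
print(render(example))
```

With this encoding, a string such as `imp("P", "A")` is rejected by `is_wff` because P is a formulaic letter rather than a propositional variable in the assumed list.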


The use of parentheses in definition 1.3 conforms to [22, p. 7] and [114, p. 185]. 
Parentheses without rules of precedence reflect the motto 


more parentheses but less memorizing [114, p. 216]. 


By definition 1.3, a string of symbols such as $(P) \Rightarrow (Q)$ is not yet a well-formed 
formula of the propositional calculus. It only becomes so after P and Q 
have been replaced by propositional variables or well-formed formulae, for instance, 
$(A) \Rightarrow (B)$. In this section, any capital letter may denote a propositional letter, 
variable, or formula. In subsequent sections, however, the distinction may matter. 

From definition 1.3, propositional letters or variables are “atoms” or “atomic” 
propositional formulae in the sense that they are the simplest well-formed proposi- 
tional formulae [72, p. 5], in contrast to more elaborate “composite” or “compound” 
formulae, also called propositional forms, built from rules W1 and W2. 

Several choices of well-formed propositional formulae can serve as axioms. 
A system that remains concise and differentiates the roles of separate connectives 
consists of the following three axioms. Subsection 1.3.10 explains their popularity 
[18, §20, p. 119], [84, p. 49], [81, p. 31], [85, p. 165], [122, p. 165]. 


1.4 Definition (Jan Lukasiewicz’s axioms). A logical formula is an axiom of the 
version of the classical propositional calculus considered here if and only if it is one 
of the following three formulae, attributed to Lukasiewicz [62, p. 29], where P, Q, 
and R may be any well-formed propositional formulae. The first two axioms are also 
all the axioms of the Pure Positive Implicational Propositional Calculus: 


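Axiom P1  $(P) \Rightarrow [(Q) \Rightarrow (P)]$. 
Axiom P2  $\{(P) \Rightarrow [(Q) \Rightarrow (R)]\} \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow [(P) \Rightarrow (R)]\}$. 
Axiom P3  $\{[\neg(Q)] \Rightarrow [\neg(P)]\} \Rightarrow [(P) \Rightarrow (Q)]$. 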


The first two axioms are also in Frege’s work [38], [39, p. 137, eq. (1) & (2)], 
where they reflect a common mathematical model of rational thinking: 


• Axiom P1, $(P) \Rightarrow [(Q) \Rightarrow (P)]$, is called the law of affirmation of the 
consequent. In Frege’s (translated) words, axiom P1 states that 


If a proposition [P] holds, it also holds in case an arbitrary proposition [Q] holds [39, 
p. 137]. 


• Axiom P2, $\{(P) \Rightarrow [(Q) \Rightarrow (R)]\} \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow [(P) \Rightarrow (R)]\}$, is called 
the law of self-distributivity of implication. In Frege’s words, axiom P2 states 
that 


If a proposition [R] is the necessary consequence of two propositions ([Q] and [P]), 
that is, if [$(P) \Rightarrow \{(Q) \Rightarrow (R)\}$], and if the first term [Q] is again the necessary 
consequence of the other [P], then the proposition [R] is the necessary consequence of 
the last proposition [P] alone [39, p. 139]. 

• Axiom P3, $\{[\neg(Q)] \Rightarrow [\neg(P)]\} \Rightarrow [(P) \Rightarrow (Q)]$, called the converse law of 
contraposition [18, §20, p. 119], states that a contraposition, $[\neg(Q)] \Rightarrow [\neg(P)]$, 
suffices to establish a classical logical implication $(P) \Rightarrow (Q)$. 


The converse law of contraposition distinguishes classical logic from several 
other systems of logic, for instance, Hilbert’s Positive Propositional Calculus, 
Brouwer & Heyting’s Intuitionistic Logic, and Kolmogorov & Johansson’s Minimal 
Logic [18, §26, p. 140-146]. Still, all these logical systems include Lukasiewicz’s 
first two axioms, P1 and P2. 

The propositional calculus includes the following concepts of theorem and proof. 


1.5 Definition. A well-formed propositional form is a theorem of a propositional 
calculus if and only if it is obtained by the following rules of inference: 


1.6 Rule (Axioms). 
Every axiom (of a logic) is a theorem (of the same logic). 


1.7 Rule (“Modus Ponens” (abbreviated by M. P.), or “Detachment”). 


For all propositional forms H and C, 
if H is a theorem and 
if $(H) \Rightarrow (C)$ is a theorem, 
then C is a theorem. 


The name of this rule will be printed here as “Detachment” to avoid unintended 
awkward sentences. With Detachment, H is the minor premiss while $(H) \Rightarrow (C)$ 
is the major premiss (so spelled to distinguish its plural from “premises” [18, p. 1, 
footnote 3]). Rule 1.7 states that if a hypothesis H and an implication $(H) \Rightarrow (C)$ 
hold, then the conclusion C may be “detached from the implication” (“von der 
Implikation abgetrennt” in Lukasiewicz’s language [80, p. 124]). 


1.8 Remark. Each use of the rule of Detachment requires two previously proved 
well-formed formulae: a proved hypothesis H and a proved implication $(H) \Rightarrow (C)$. 


Definitions 1.3 and 1.4 allow P, Q, and R to denote any propositional letters 
or well-formed propositional formulae, so that axioms P1–P3 are templates, or 
schemas, to generate axioms. Alternatively, allowing only propositional variables 
in axioms P1–P3 but introducing the substitution rule 1.9 gives equivalent axioms: 


1.9 Rule (Substitution). 


For each propositional variable K in a theorem R (which is a propositional form), 

and for each well-formed propositional form L, 

the propositional form obtained by replacing in R every occurrence of K by L is again a 
theorem. 


A proof, or deduction, of a theorem R is a finite sequence of logical formulae 
P,Q, ..., R, in which each formula is either a substitution in an axiom or in a 
previously proven formula, or results from the rule of Detachment (Modus Ponens). 


1.10 Example (Substitution). The formula $(A) \Rightarrow [(B) \Rightarrow (A)]$ is an instance of 
axiom P1; hence it is a theorem. Substituting $\neg(C)$ for A in $(A) \Rightarrow [(B) \Rightarrow (A)]$ 
yields $[\neg(C)] \Rightarrow \{(B) \Rightarrow [\neg(C)]\}$, which is another instance of axiom P1, and 
hence also a theorem. Because such substitutions in an axiom yield other axioms, 
each axiom is also called an axiom schema. Thus both axioms $(A) \Rightarrow [(B) \Rightarrow (A)]$ 
and $[\neg(C)] \Rightarrow \{(B) \Rightarrow [\neg(C)]\}$ result from the axiom schema P1. 
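The substitution in example 1.10 can be carried out mechanically. The following sketch assumes a tuple encoding of formulae that is not part of the text: a propositional variable is a string, `("not", P)` stands for the negation of P, and `("imp", P, Q)` stands for the implication from P to Q. The helper names `neg`, `imp`, and `substitute` are hypothetical.

```python
def neg(p):
    return ("not", p)

def imp(p, q):
    return ("imp", p, q)

def substitute(formula, variable, replacement):
    """Replace every occurrence of `variable` in `formula` by `replacement`.

    The replacement itself is not searched again, so all occurrences
    are replaced simultaneously, as in rule 1.9.
    """
    if isinstance(formula, str):
        return replacement if formula == variable else formula
    # Keep the connective and substitute inside each component.
    return (formula[0],) + tuple(
        substitute(part, variable, replacement) for part in formula[1:]
    )

axiom_instance = imp("A", imp("B", "A"))             # (A) ⇒ [(B) ⇒ (A)]
result = substitute(axiom_instance, "A", neg("C"))   # substitute ¬(C) for A
expected = imp(neg("C"), imp("B", neg("C")))         # [¬(C)] ⇒ {(B) ⇒ [¬(C)]}
print(result == expected)
```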


1.11 Definition. For every logical formula R, the symbol $\vdash$ is called a “turnstile” 
and read “yield(s)” [110, p. 57], and the notation 


$\vdash R$ 


means that there exists a proof of R. An alternative notation, $\mathrm{P1}, \mathrm{P2}, \mathrm{P3} \vdash R$, also 
specifies the list of axioms, here P1, P2, P3, from which R is a theorem. 

More generally, for all logical formulae P and R, the notation $P \vdash R$ means that 
with P added to the list of axioms, there exists a proof of R. The corresponding 
alternative notation, $\mathrm{P1}, \mathrm{P2}, \mathrm{P3}, P \vdash R$, again specifies the list of axioms. In other 
words, R is a theorem for the logic with axioms P1, P2, P3, P. With a different 
terminology, $P \vdash R$ means that R is derivable from P and the axioms. 

Yet more generally, for all logical formulae P, Q, ..., R, either notation 
$P, Q, \ldots \vdash R$ or $\mathrm{P1}, \mathrm{P2}, \mathrm{P3}, P, Q, \ldots \vdash R$ means that with P, Q, ... added to 
the list of axioms, there exists a proof of R. In other words, R is a theorem for 
the logic with axioms P1, P2, P3, P, Q, ... The formula R is then derivable from 
P, Q, ..., if and only if $P, Q, \ldots \vdash R$. In the notation of Smullyan [117, p. 17] and 
Stolyar [123, p. 63], $P, Q, \ldots \vdash R$ is also denoted by 


$$\frac{P, Q, \ldots}{R}.$$ 


Verifying a proof reduces to checking that each step conforms to the foregoing 
definition of proof. In contrast, constructing a proof may require some creativity, 
which may involve trying some rules and some axioms in various combinations, 
some of which may fail whereas others may succeed [72, p. 55, lines 1-3], [114, 
p. 31]. For the propositional calculus presented here, there is an algorithm (a recipe) 
to design proofs, but its justification first requires most of the proofs presented here 
[123, p. 193-197]. Moreover, the algorithm is cumbersome and would generate 
proofs longer than the ones explained here. Nevertheless, the collection of all the 
proofs shown here will demonstrate the steps that the algorithm would involve. 
With such an understanding of the algorithm, a user might then automate the 
algorithm with a computer [47, 50, 54, 139]. Furthermore, the following proofs also 
provide some practice in creating proofs without using an algorithm, a practice that 
corresponds more closely to the situation in mathematics, and for which there can 
be no algorithm [46]. 
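The claim that verifying a proof reduces to checking each step can be illustrated by a small checker. This is a minimal sketch under stated assumptions, not the algorithm cited above: formulae are encoded as nested tuples (`("not", P)` and `("imp", P, Q)`, with strings as propositional variables), axioms are recognized as instances of the schemata P1, P2, P3 by pattern matching rather than through a separate substitution rule, and the names `SCHEMATA`, `match`, `is_axiom`, and `check_proof` are hypothetical.

```python
def neg(p):
    return ("not", p)

def imp(p, q):
    return ("imp", p, q)

# Axiom schemata; the strings "P", "Q", "R" play the role of formulaic letters.
SCHEMATA = {
    "P1": imp("P", imp("Q", "P")),
    "P2": imp(imp("P", imp("Q", "R")), imp(imp("P", "Q"), imp("P", "R"))),
    "P3": imp(imp(neg("Q"), neg("P")), imp("P", "Q")),
}

def match(pattern, formula, binding):
    """Extend `binding` so that the pattern, with its letters replaced, equals formula."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    if isinstance(formula, str) or formula[0] != pattern[0]:
        return False
    return all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:]))

def is_axiom(formula):
    """True if the formula is an instance of one of the three schemata."""
    return any(match(schema, formula, {}) for schema in SCHEMATA.values())

def check_proof(steps, hypotheses=()):
    """True if every step is a hypothesis, an axiom instance, or follows by
    Detachment from two formulae that appear earlier in the sequence."""
    proved = []
    for step in steps:
        by_detachment = any(imp(h, step) in proved for h in proved)
        if not (step in hypotheses or is_axiom(step) or by_detachment):
            return False
        proved.append(step)
    return True

# A five-step derivation of (A) ⇒ (A), used here only as test data.
A = "A"
AA = imp(A, A)
proof = [
    imp(A, imp(AA, A)),                             # instance of P1
    imp(imp(A, imp(AA, A)), imp(imp(A, AA), AA)),   # instance of P2
    imp(imp(A, AA), AA),                            # Detachment
    imp(A, AA),                                     # instance of P1
    AA,                                             # Detachment
]
print(check_proof(proof))
```

Checking is purely mechanical, whereas finding the five formulae in the first place is the part that may require creativity.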

1.3. The Pure Positive Implicational Propositional Calculus 


With the two rules of inference (Detachment and substitution), the first two axioms 
of classical propositional calculus (P1 and P2) pertain only to logical implications 
($\Rightarrow$); they form the Pure Positive Implicational Propositional Calculus, which is 
common to other logics [18, § 29, p. 161]. In contrast, the concept of negation ($\neg$) 
does not belong to the Positive Implicational Propositional Calculus. Axioms about 
the negation, for instance, axiom P3, differentiate classical logic from other logics. 


1.3.1 Examples of Proofs in the Implicational Calculus 


The following three theorems provide examples of proofs about or in the Pure 
Positive Implicational Propositional Calculus, which is the propositional calculus 
with implications but no negations. The first theorem is called a “derived rule” 
(of inference) because it involves a hypothesis, T, which can be any axiom or 
previous theorem. Such a derived rule of inference is a theorem about rather than 
in the implicational calculus, but it provides a recipe to shorten subsequent proofs. 
Specifically, theorem 1.12 shows a derivation of $(S) \Rightarrow (T)$ from T and axiom P1. 
The proof of theorem 1.12 is also a building block of the Deduction Theorem 1.22. 


1.12 Theorem (derived rule). For each well-formed formula S and for each 
theorem T, the implication $(S) \Rightarrow (T)$ is a theorem: $\mathrm{P1}, T \vdash (S) \Rightarrow (T)$. 


Proof. Apply axiom P1 and Detachment as follows: 

\[
\begin{array}{ll}
\vdash T & \text{hypothesis (minor premiss),} \\
\vdash (T) \Rightarrow [(S) \Rightarrow (T)] & \text{substitution in axiom P1 (major premiss),} \\
\vdash (S) \Rightarrow (T) & \text{Detachment and preceding two formulae.}
\end{array}
\]

Thus $(S) \Rightarrow (T)$ is a theorem derivable from axiom P1, the hypothesis T, and 
Detachment. □ 


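The second theorem uses axiom P2 and also involves a hypothesis, 
$(H) \Rightarrow [(K) \Rightarrow (L)]$, which can be any formula of this form that has already been proved. 


1.13 Theorem (derived rule). For all well-formed formulae H, K, L, if 
$(H) \Rightarrow [(K) \Rightarrow (L)]$ 
is a theorem, then 
$[(H) \Rightarrow (K)] \Rightarrow [(H) \Rightarrow (L)]$ 
is also a theorem: $\mathrm{P2},\ (H) \Rightarrow [(K) \Rightarrow (L)] \vdash [(H) \Rightarrow (K)] \Rightarrow [(H) \Rightarrow (L)]$. 


Proof. Apply axiom P2 and Detachment: 

\[
\begin{array}{ll}
\vdash (H) \Rightarrow [(K) \Rightarrow (L)] & \text{hypothesis,} \\
\vdash \{(H) \Rightarrow [(K) \Rightarrow (L)]\} \Rightarrow \{[(H) \Rightarrow (K)] \Rightarrow [(H) \Rightarrow (L)]\} & \text{axiom P2,} \\
\vdash [(H) \Rightarrow (K)] \Rightarrow [(H) \Rightarrow (L)] & \text{Detachment.}
\end{array}
\]

□ 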

The third theorem (1.14) involves no hypotheses other than the axioms. Thus 
theorem 1.14 is a theorem of the Pure Implicational Propositional Calculus. More 
accurately, theorem 1.14 is a theorem schema representing a different theorem for 
each different formula P. The proof of theorem 1.14 is also a building block of the 
Deduction Theorem 1.22. 


1.14 Theorem (reflexive law of implication). For each well-formed propositional 
formula P, the formula 


$(P) \Rightarrow (P)$ 
is a theorem derivable from axioms P1, P2, and Detachment. 


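Proof. Apply axiom P1, theorem 1.13, and Detachment, with lines numbered for 
clarity (the line numbers are not parts of the proof): 

\[
\begin{array}{lll}
1 & \vdash \underbrace{(P)}_{H} \Rightarrow \{\underbrace{\overbrace{[(P) \Rightarrow (P)]}^{Q}}_{K} \Rightarrow \underbrace{(P)}_{L}\} & \text{substitution in axiom P1,} \\
2 & \vdash \{\overbrace{(P)}^{H} \Rightarrow \overbrace{[(P) \Rightarrow (P)]}^{K}\} \Rightarrow [\overbrace{(P)}^{H} \Rightarrow \overbrace{(P)}^{L}] & \text{theorem 1.13,} \\
3 & \vdash (P) \Rightarrow [(P) \Rightarrow (P)] & \text{substitution in axiom P1,} \\
4 & \vdash (P) \Rightarrow (P) & \text{lines 2, 3, Detachment.}
\end{array}
\]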


A formal proof would replace line 2 by the details of the proof of theorem 1.13, 
labeled here as lines 2a and 2b, which gives the following proof: 


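\[
\begin{array}{lll}
1 & \vdash \underbrace{\overbrace{(P)}^{P} \Rightarrow \{\overbrace{[(P) \Rightarrow (P)]}^{Q} \Rightarrow \overbrace{(P)}^{R}\}}_{S} & \text{P1,} \\
2a & \vdash \underbrace{\Big(\overbrace{(P)}^{P} \Rightarrow \{\overbrace{[(P) \Rightarrow (P)]}^{Q} \Rightarrow \overbrace{(P)}^{R}\}\Big)}_{S} \Rightarrow \underbrace{\Big(\{\overbrace{(P)}^{P} \Rightarrow \overbrace{[(P) \Rightarrow (P)]}^{Q}\} \Rightarrow [\overbrace{(P)}^{P} \Rightarrow \overbrace{(P)}^{R}]\Big)}_{U} & \text{P2,} \\
2b & \vdash \overbrace{\underbrace{\{(P) \Rightarrow [(P) \Rightarrow (P)]\}}_{V} \Rightarrow \underbrace{[(P) \Rightarrow (P)]}_{W}}^{U} & \text{1, 2a, M.P.,} \\
4 & \vdash \overbrace{(P) \Rightarrow [(P) \Rightarrow (P)]}^{V} & \text{P1,} \\
5 & \vdash \underbrace{(P) \Rightarrow (P)}_{W} & \text{2b, 4, M.P.}
\end{array}
\]

Thus, $\mathrm{P1}, \mathrm{P2} \vdash (P) \Rightarrow (P)$. □ 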
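Because theorem 1.14 is a theorem schema, the same five formulae can be generated for any well-formed formula P. The sketch below assumes a tuple encoding that is not part of the text (`("imp", P, Q)` for an implication, `("not", P)` for a negation, strings for propositional variables). The names `reflexivity_proof` and `detach` are hypothetical, while the local names `s`, `u`, `v` follow the labels S, U, V in the formal proof above.

```python
def imp(p, q):
    return ("imp", p, q)

def reflexivity_proof(p):
    """Return, in order, the five formulae of the formal proof of (p) ⇒ (p)."""
    pp = imp(p, p)                 # W: the formula to be proved
    s = imp(p, imp(pp, p))         # line 1: instance of P1
    u = imp(imp(p, pp), pp)        # line 2b: obtained by Detachment
    line_2a = imp(s, u)            # line 2a: instance of P2
    v = imp(p, pp)                 # line 4: instance of P1
    return [s, line_2a, u, v, pp]  # line 5 is W

def detach(minor, major):
    """Detachment: from `minor` and (minor) ⇒ (c), return c."""
    if isinstance(major, tuple) and major[0] == "imp" and major[1] == minor:
        return major[2]
    raise ValueError("major premiss is not an implication from the minor premiss")

# The schema applied to the compound formula ¬(B) instead of a single variable.
p = ("not", "B")
proof = reflexivity_proof(p)
print(detach(proof[0], proof[1]) == proof[2])   # line 2b from lines 1 and 2a
print(detach(proof[3], proof[2]) == proof[4])   # line 5 from lines 4 and 2b
```

The sketch checks only the two uses of Detachment; that lines 1, 2a, and 4 are instances of P1 and P2 holds by construction.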

1.3.2. Derived Rules: Implications Subject to Hypotheses 


The following theorems about the Pure Positive Implicational Propositional 
Calculus are variants of the transitivity of implications: they shorten sequences of 
implications into one implication, from the initial hypothesis to the last conclusion. 
The same theorems also allow for the substitution of any well-formed propositional 
formula by any logically equivalent formula. Moreover, their proofs constitute steps 
of an algorithm known as the Deduction Theorem 1.22. For instance, the following 
derived rules of inference extend the rule of Detachment to situations where K and
(K) => (L) might hold only under some hypothesis H: then the conclusion L also
holds under the same hypothesis H.


1.15 Theorem (derived rule). For all well-formed formulae H, K, L, if
(H) => (K) and
(H) => [(K) => (L)]
are theorems, then
(H) => (L)
is also a theorem. Thus,
(H) => (K), (H) => [(K) => (L)] ⊢ (H) => (L).


Proof. Apply theorem 1.13 and Detachment:


1 ⊢ (H) => [(K) => (L)]   hypothesis;
2 ⊢ [(H) => (K)] => [(H) => (L)]   theorem 1.13, major premiss;
3 ⊢ (H) => (K)   hypothesis, minor premiss;
4 ⊢ (H) => (L)   lines 3, 2, and Detachment.


□


The following theorem simplifies the use of theorem 1.15 if one of the logical
implications is a theorem. In particular, in an implication (K) => (L), theorem 1.16
allows for the substitution of a stronger hypothesis H implying (or equivalent to) K.
Alternatively, in an implication (H) => (K), theorem 1.16 allows for the substitution
of a weaker conclusion L implied by (or equivalent to) K.


1.16 Theorem (derived rule). For all well-formed formulae H, K, L, if
(H) => (K) and
(K) => (L)
are theorems, then
(H) => (L)
is also a theorem. Thus, (H) => (K), (K) => (L) ⊢ (H) => (L).


Proof. Apply theorems 1.12 and 1.15:


⊢ (K) => (L)   hypothesis,
⊢ (H) => [(K) => (L)]   theorem 1.12,
⊢ (H) => (K)   hypothesis,
⊢ (H) => (L)   theorem 1.15.

□
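A derived rule such as theorem 1.16 abbreviates a fixed pattern of primitive steps. As a minimal sketch, assuming formulae encoded as nested tuples and using hypothetical helper names (imp, show, expand_1_16, detachments_ok), the following Python code writes out the citations of theorems 1.12 and 1.13 hidden inside the proof of theorem 1.16 and confirms that each use of Detachment has its two premisses among the earlier lines. The axiom labels are stated by the sketch, not verified by it.

```python
def imp(a, b):
    # The implication (A) => (B) as a nested tuple.
    return ("=>", a, b)

def show(f):
    """Render a formula with parentheses only, e.g. (H) => ((K) => (L))."""
    return f if isinstance(f, str) else f"({show(f[1])}) => ({show(f[2])})"

def expand_1_16(h, k, l):
    """Primitive steps deriving (H) => (L) from (H) => (K) and (K) => (L)."""
    kl, hk, hl = imp(k, l), imp(h, k), imp(h, l)
    h_kl = imp(h, kl)
    return [
        (kl, "hypothesis"),
        (imp(kl, h_kl), "axiom P1"),            # theorem 1.12 written out
        (h_kl, "Detachment"),
        (imp(h_kl, imp(hk, hl)), "axiom P2"),   # theorem 1.13 written out
        (imp(hk, hl), "Detachment"),
        (hk, "hypothesis"),
        (hl, "Detachment"),
    ]

def detachments_ok(steps):
    """Check that every line labelled Detachment follows from earlier lines
    M and (M) => (line); other labels are taken on trust."""
    seen = []
    for formula, reason in steps:
        if reason == "Detachment" and not any(imp(m, formula) in seen for m in seen):
            return False
        seen.append(formula)
    return True

for formula, reason in expand_1_16("H", "K", "L"):
    print(f"{show(formula):60} {reason}")
print(detachments_ok(expand_1_16("H", "K", "L")))
```

The four-line proof by citation thus stands for seven primitive lines, which is the kind of expansion that the Deduction Theorem 1.22 later performs systematically.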
Similarly, the following theorem simplifies the use of theorem 1.15 if one of the
components is already a theorem.


1.17 Theorem (derived rule). For all well-formed formulae H, K, L, if
(K) and
(H) => [(K) => (L)]
are theorems, then
(H) => (L)
is also a theorem. Thus, K, (H) => [(K) => (L)] ⊢ (H) => (L).


Proof. Apply theorems 1.12 and 1.15:


⊢ K   hypothesis,
⊢ (H) => (K)   theorem 1.12,
⊢ (H) => [(K) => (L)]   hypothesis,
⊢ (H) => (L)   theorem 1.15.


□


The following theorem demonstrates how the rule of Detachment extends to a 
sequence of several logical implications. 


1.18 Theorem (derived rule). For all well-formed formulae P, Q, R, S, if
(P) => (Q),
(P) => [(Q) => (R)], and
(P) => [(R) => (S)]
are theorems, then
(P) => (S)


is also a theorem. Thus,


(P) => (Q), (P) => [(Q) => (R)], (P) => [(R) => (S)] ⊢ (P) => (S).


Proof. Apply theorem 1.15 twice:


⊢ (P) => (Q)   hypothesis,
⊢ (P) => [(Q) => (R)]   hypothesis,
⊢ (P) => (R)   theorem 1.15,
⊢ (P) => [(R) => (S)]   hypothesis,
⊢ (P) => (S)   theorem 1.15.


□


Similar theorems hold for a sequence of more than three consecutive implica- 
tions, but their need will not arise here. The following theorem simplifies the use of 
theorem 1.18 if two of the logical implications are already theorems. 


1.19 Theorem (derived rule). For all well-formed formulae P, Q, R, S, if
(P) => (Q),
(Q) => (R), and
(R) => (S)
are theorems, then
(P) => (S)


is also a theorem. Thus,
(P) => (Q), (Q) => (R), (R) => (S) ⊢ (P) => (S).


Proof. Apply theorem 1.16 twice:
⊢ (P) => (Q)   hypothesis,
⊢ (Q) => (R)   hypothesis,
⊢ (P) => (R)   theorem 1.16,
⊢ (R) => (S)   hypothesis,
⊢ (P) => (S)   theorem 1.16.

□


The following theorems demonstrate the transitivity of implications subject to a 
common hypothesis. 


1.20 Theorem (derived rule). For all well-formed formulae H, K, L, M, if
(H) => [(K) => (L)] and
(H) => {(K) => [(L) => (M)]}
are theorems, then
(H) => [(K) => (M)]
is also a theorem. Thus,


(H) => [(K) => (L)], (H) => {(K) => [(L) => (M)]} ⊢ (H) => [(K) => (M)].


Proof. Use axiom P2 with theorems 1.16 and 1.15:


⊢ (H) => {(K) => [(L) => (M)]}   hypothesis,
⊢ {(K) => [(L) => (M)]} => {[(K) => (L)] => [(K) => (M)]}   axiom P2,
⊢ (H) => {[(K) => (L)] => [(K) => (M)]}   theorem 1.16;
⊢ (H) => [(K) => (L)]   hypothesis,
⊢ (H) => [(K) => (M)]   theorem 1.15.


□


1.21 Theorem (derived rule). For all well-formed formulae H, K, L, M, if
(H) => (K),
(H) => (L), and
(K) => [(L) => (M)]
are theorems, then
(H) => (M)
is also a theorem. Thus,


(H) => (K), (H) => (L), (K) => [(L) => (M)] ⊢ (H) => (M).

Proof. Apply theorems 1.12, 1.13, and 1.15:


1 ⊢ (K) => [(L) => (M)]   hypothesis,
2 ⊢ (H) => {(K) => [(L) => (M)]}   theorem 1.12,
3 ⊢ [(H) => (K)] => {(H) => [(L) => (M)]}   theorem 1.13, major premiss,
4 ⊢ (H) => (K)   hypothesis, minor premiss,
5 ⊢ (H) => [(L) => (M)]   lines 4, 3, and Detachment,
6 ⊢ (H) => (L)   hypothesis,
7 ⊢ (H) => (M)   theorem 1.15.

□


1.3.3. A Guide for Proofs: an Implicational Deduction Theorem 


In general, the question arises of how to find a proof of a theorem. One guide to
designing a proof of an implication (H) => (C), where H and C denote well-formed
propositional formulae, begins with a derivation H ⊢ C of the
conclusion C from the hypothesis H and the axioms. For instance, all the proofs
of derived rules in subsection 1.3.2 are examples of such derivations. The method
for designing a proof then proceeds to “discharge” the hypothesis H to produce a
proof of (H) => (C), as described in the proof of the Deduction Theorem 1.22,
which is not in but about the implicational calculus. More generally, from any proof
that a logical proposition S is derivable from proved hypotheses H, K, ..., M, N, the
Deduction Theorem provides a recipe to turn that proof into a proof of


(H) => {(K) => ...(M) => [(N) => (S)]...}.


The Deduction Theorem presented here is also a part of an algorithm to design 
proofs within the full Classical Propositional Calculus. 


1.22 Theorem (Deduction Theorem for the Pure Classical Propositional Calculus,
preliminary version). With any axiom system for which axioms P1, P2, and
(P) => (P) are axioms or theorems (or schema thereof), there is an algorithm to
transform any proof of


H, K, ..., M, N ⊢ S
within the Classical Propositional Calculus into a proof of
(H) => {(K) => ...(M) => [(N) => (S)]...}.


Proof (Outline). The Deduction Theorem removes the hypotheses one at a time, for
instance, beginning with the last one listed, here N, from all the steps in the proof.

(D1) If the step ⊢ P in the initial proof is the hypothesis N being removed, then
in the new proof the Deduction Theorem replaces the old step
⊢ N (current hypothesis)
by a complete proof of (N) => (N), for instance, that of theorem 1.14:


⊢ (N) => {[(N) => (N)] => (N)}   axiom P1,
⊢ ((N) => {[(N) => (N)] => (N)}) => ({(N) => [(N) => (N)]} => [(N) => (N)])   axiom P2,
⊢ {(N) => [(N) => (N)]} => [(N) => (N)]   Detachment,
⊢ (N) => [(N) => (N)]   axiom P1,
⊢ (N) => (N)   Detachment.


(D2) If a step ⊢ P of the initial proof is a substitution in one of the axioms, or
in a previously proved theorem, or one of the hypotheses other than the one N
being removed here, then in the new proof the Deduction Theorem replaces the
old step

⊢ P (axiom or hypothesis)
by a complete proof of (N) => (P), for instance, as in theorem 1.12:


⊢ P   axiom or hypothesis,
⊢ (P) => [(N) => (P)]   axiom P1,
⊢ (N) => (P)   Detachment.


(D3) If the step ⊢ P is derived in the initial proof by Detachment from previously
proved propositions M and (M) => (P), then (D2) and (D1) allow for their
replacement by complete proofs of (N) => (M) and (N) => [(M) => (P)]
respectively. Specifically, in the new proof, the Deduction Theorem then
replaces the old steps

⊢ (M) => (P) (previously proven True),
⊢ M (previously proven True),
⊢ P (Detachment),
by a complete proof that (N) => (P), for instance, as in theorem 1.15:
⊢ (N) => [(M) => (P)]   theorem 1.12,
⊢ {(N) => [(M) => (P)]} => {[(N) => (M)] => [(N) => (P)]}   axiom P2,
⊢ [(N) => (M)] => [(N) => (P)]   Detachment,
⊢ (N) => (M)   theorem 1.12,
⊢ (N) => (P)   Detachment,


with the proof of each instance of theorem 1.12 completely written out.

(D4) If a step ⊢ P results from a previously proved derived rule, then it may be
necessary first to replace the step ⊢ P by a complete proof of the derived rule,
and only then to replace each step of the proof of the derived rule as instructed
by directives (D1), (D2), (D3).

Still with the hypothesis N, after the completion of any operation (D1)-(D3) on step
P, the Deduction Theorem then performs the same operations (D1)-(D3) on each of
the following steps, Q, ..., R. After the completion of operations (D1)-(D3) on all
the steps P, Q, ..., R, for the hypothesis N, the Deduction Theorem gives a proof of


H, K, ..., M ⊢ [(N) => (S)].


Then the Deduction Theorem repeats the whole process with the preceding hypotheses,
H, ..., M. The Deduction Theorem terminates with a proof of


(H) => {(K) => ...(M) => [(N) => (S)]...}.


The general case follows by several applications of the previous cases, in a way
that may be specified more explicitly after the availability of the Principle of
Mathematical Induction in chapter 4. □
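Directives (D1)-(D3) describe an algorithm, which the following Python sketch makes concrete for one hypothesis N. It is an illustration under stated assumptions rather than a definitive implementation: formulae are nested tuples, each step of the initial derivation carries a label "hyp", "given" (an axiom or a previously proved theorem), or ("mp", i, j) for Detachment from line i (M) and line j ((M) => (P)), and all function names are hypothetical. Directive (D4) is assumed to have been applied beforehand.

```python
def imp(a, b):
    return ("=>", a, b)

def is_imp(f):
    return isinstance(f, tuple) and len(f) == 3 and f[0] == "=>"

def is_p1(f):  # (P) => [(Q) => (P)]
    return is_imp(f) and is_imp(f[2]) and f[2][2] == f[1]

def is_p2(f):  # {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}
    if not (is_imp(f) and is_imp(f[1]) and is_imp(f[1][2])):
        return False
    p, q, r = f[1][1], f[1][2][1], f[1][2][2]
    return f[2] == imp(imp(p, q), imp(p, r))

def check_proof(lines, given=()):
    """Each line must be given, an instance of P1 or P2, or follow by Detachment."""
    proved = []
    for f in lines:
        if not (f in given or is_p1(f) or is_p2(f)
                or any(imp(m, f) in proved for m in proved)):
            return False
        proved.append(f)
    return True

def discharge(steps, n):
    """Turn a derivation that uses hypothesis n into one whose last line is
    (n) => (last formula), following directives (D1), (D2), (D3)."""
    out = []
    for formula, reason in steps:
        target = imp(n, formula)
        if reason == "hyp" and formula == n:          # (D1): proof of (N) => (N)
            nn = imp(n, n)
            out += [imp(n, imp(nn, n)),                               # P1
                    imp(imp(n, imp(nn, n)), imp(imp(n, nn), nn)),     # P2
                    imp(imp(n, nn), nn),                              # Detachment
                    imp(n, nn),                                       # P1
                    nn]                                               # Detachment
        elif reason in ("hyp", "given"):              # (D2): as in theorem 1.12
            out += [formula, imp(formula, target), target]
        else:                                         # (D3): as in theorem 1.15
            _, i, _ = reason
            m = steps[i][0]
            n_m, n_mp = imp(n, m), imp(n, imp(m, formula))   # already in out
            out += [imp(n_mp, imp(n_m, target)),      # P2
                    imp(n_m, target),                 # Detachment
                    target]                           # Detachment
    return out

# Phase 1 of example 1.23, with H denoting (P) => [(P) => (Q)].
P, Q = "P", "Q"
PP, PQ = imp(P, P), imp(P, Q)
H = imp(P, PQ)
phase1 = [
    (H, "hyp"),                       # 1: hypothesis H
    (imp(H, imp(PP, PQ)), "given"),   # 2: axiom P2
    (imp(PP, PQ), ("mp", 0, 1)),      # 3: 1, 2, Detachment
    (PP, "given"),                    # 4: theorem 1.14, taken as already proved
    (PQ, ("mp", 3, 2)),               # 5: 3, 4, Detachment
]
phase2 = discharge(phase1, H)
print(len(phase2), phase2[-1] == imp(H, PQ), check_proof(phase2, given={PP}))
```

Removing several hypotheses amounts to calling discharge repeatedly, last hypothesis first, with the labels of the intermediate derivation updated accordingly; that bookkeeping is omitted here.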


Example 1.23 shows how to use theorem 1.22. 


1.23 Example (Tarski’s axiom II). To prove Tarski’s axiom II, {(P) => [(P) =>
(Q)]} => [(P) => (Q)], let H denote (P) => [(P) => (Q)] and let C denote
(P) => (Q), so that the formula to be proved has the form (H) => (C).


Phase 1: a proof of H ⊢ C.


For H ⊢ C, in other words, to derive C from H, substitute H in axiom P2, with K
denoting (P) => (P) and L denoting [(P) => (P)] => [(P) => (Q)], which is (K) => (C):

1 ⊢ (P) => [(P) => (Q)]   hypothesis H,
2 ⊢ {(P) => [(P) => (Q)]} => {[(P) => (P)] => [(P) => (Q)]}   axiom P2, of the form (H) => (L),
3 ⊢ [(P) => (P)] => [(P) => (Q)]   1, 2, Detachment, of the form (K) => (C),
4 ⊢ (P) => (P)   theorem 1.14, which is K,
5 ⊢ (P) => (Q)   3, 4, Detachment, which is C,
which completes the derivation H ⊢ C of C from H.

Phase 2: a proof of (H) => (C) from H ⊢ C.


To transform this derivation (H ⊢ C) into a proof of (H) => (C), apply to each step
of phase 1 the procedure described in the Deduction Theorem (1.22):


⊢ {(P) => [(P) => (Q)]} => {(P) => [(P) => (Q)]}   (D1), of the form (H) => (H),
⊢ {(P) => [(P) => (Q)]} => ({(P) => [(P) => (Q)]} => {[(P) => (P)] => [(P) => (Q)]})   (D2), P2, of the form (H) => [(H) => (L)],
⊢ {(P) => [(P) => (Q)]} => {[(P) => (P)] => [(P) => (Q)]}   (D3), of the form (H) => [(K) => (C)],
⊢ {(P) => [(P) => (Q)]} => [(P) => (P)]   (D2), 1.12, of the form (H) => (K),
⊢ {(P) => [(P) => (Q)]} => [(P) => (Q)]   (D3), of the form (H) => (C),
which completes the proof of {(P) => [(P) => (Q)]} => [(P) => (Q)]. A completely
formal proof would also expand each of the steps just listed into its own proof,
as specified in the proof of the Deduction Theorem 1.22, for instance, expanding
each use of the directive (D3), which uses theorem 1.15, into a complete proof of
theorem 1.15.


Although theorem 1.22 can produce lengthy proofs, the resulting long proofs can 
also suggest shorter proofs, for instance, as in theorem 1.24. 


1.24 Theorem. The formula {(P) => [(P) => (Q)]} => [(P) => (Q)] is a theorem.
Proof. Apply axiom P2 with theorems 1.14 and 1.17, with H denoting
(P) => [(P) => (Q)], K denoting (P) => (P), and L denoting (P) => (Q):


⊢ (P) => (P)   theorem 1.14, which is K,
⊢ {(P) => [(P) => (Q)]} => {[(P) => (P)] => [(P) => (Q)]}   axiom P2, of the form (H) => [(K) => (L)],
⊢ {(P) => [(P) => (Q)]} => [(P) => (Q)]   theorem 1.17, of the form (H) => (L).

□


Theorem 1.24 is also used in the form of a derived rule. 


1.25 Theorem (derived rule). If (P) => [(P) => (Q)] is a theorem, then (P) =>
(Q) is also a theorem.

Proof. Apply theorem 1.24 and Detachment, with H denoting (P) => [(P) => (Q)]
and C denoting (P) => (Q):
⊢ {(P) => [(P) => (Q)]} => [(P) => (Q)]   theorem 1.24, of the form (H) => (C),
⊢ (P) => [(P) => (Q)]   hypothesis, which is H,
⊢ (P) => (Q)   Detachment, which is C.

□


The design of proofs of formulae of the form (A) => [(B) => (C)] can start with a
derivation A, B ⊢ C of C from two hypotheses A and B. With A treated as an axiom,
and B as the hypothesis H, a first application of the Deduction Theorem (1.22)
leads to a derivation A ⊢ (B) => (C). Then, with A as the hypothesis H, a second
application of the Deduction Theorem (1.22) yields a proof of (A) => [(B) => (C)].

The general case with several hypotheses follows by several applications of the 
Deduction Theorem (1.22), in a way that may be specified more explicitly after the 
availability of the Principle of Mathematical Induction in chapter 4. 
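The shape of the formula produced by discharging several hypotheses can be computed by a right-to-left fold, which mirrors the order of removal (last hypothesis first). The following Python sketch assumes formulae encoded as nested tuples; the names imp, show, and nest are hypothetical, and show prints parentheses only instead of the mixed brackets used in the text.

```python
from functools import reduce

def imp(a, b):
    # The implication (A) => (B) as a nested tuple.
    return ("=>", a, b)

def show(f):
    """Render a formula with parentheses only."""
    return f if isinstance(f, str) else f"({show(f[1])}) => ({show(f[2])})"

def nest(hypotheses, conclusion):
    """Formula obtained from hypotheses ⊢ conclusion by discharging the
    hypotheses one at a time, beginning with the last one listed."""
    return reduce(lambda acc, h: imp(h, acc), reversed(hypotheses), conclusion)

# Two hypotheses A, B and conclusion C give (A) => [(B) => (C)].
print(show(nest(["A", "B"], "C")))
# Three hypotheses H, K, N and conclusion S.
print(show(nest(["H", "K", "N"], "S")))
```

The fold only builds the target formula; producing its proof still requires one application of the Deduction Theorem per hypothesis.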


1.3.4 Example: Law of Assertion from the Deduction Theorem 


The following proof shows the use of the Deduction Theorem in designing proofs. 


1.26 Theorem. The law of assertion (A) => {[(A) => (B)] => (B)} is a theorem.


Proof. A finished proof can proceed as follows:


⊢ [(A) => (B)] => [(A) => (B)]   theorem 1.14,
⊢ (A) => {[(A) => (B)] => [(A) => (B)]}   theorem 1.12,
⊢ (A) => {[(A) => (B)] => (A)}   axiom P1,
⊢ (A) => {[(A) => (B)] => (B)}   theorem 1.20.


The following considerations explain how to design such a proof.
The formula (A) => {[(A) => (B)] => (B)} has the pattern (H) => [(K) => (S)]
of the Deduction Theorem, with A for H, (A) => (B) for K, and B for S.


Step 1.


As in the Deduction Theorem, assume first that the hypotheses H and K are proved,
and from them derive the conclusion S by proving H, K ⊢ S. Here, assume that the
hypotheses A and (A) => (B) are both proved, and prove A, [(A) => (B)] ⊢ B:

⊢ A   first temporary hypothesis,
⊢ (A) => (B)   second temporary hypothesis,
⊢ B   Detachment.

The foregoing derivation shows that if A and (A) => (B) are proved, then B is
proved. Still under the first hypothesis A, the Deduction Theorem allows for the
removal of the second hypothesis, (A) => (B), as follows.


Step 2. 


Step 2.1 


The first line in step 1 consists of the other hypothesis, A, which is assumed proved,
whence instructions (D2) in the Deduction Theorem replace A with a complete proof
of [(A) => (B)] => (A) as in theorem 1.12. In other words, replace the first line, ⊢ A,
by the following three lines:

Proof of theorem 1.12:


⊢ A   temporary hypothesis,
⊢ (A) => {[(A) => (B)] => (A)}   axiom P1, with Q denoting (A) => (B),
⊢ [(A) => (B)] => (A)   Detachment.


End of proof of theorem 1.12.


Step 2.2 


Similarly, the second line in step 1 consists of the hypothesis K being currently
removed, here (A) => (B), which instructions (D1) in the Deduction Theorem
replace with a complete proof of (K) => (K), here [(A) => (B)] => [(A) => (B)],
as in theorem 1.14. Thus, replace the second line, ⊢ (A) => (B), by the following
lines:


Proof of theorem 1.14
with [K] for [(A) => (B)]:


⊢ [K] => ({[K] => [K]} => [K])   axiom P1,
⊢ {[K] => ({[K] => [K]} => [K])} => [([K] => {[K] => [K]}) => {[K] => [K]}]   axiom P2,
⊢ ([K] => {[K] => [K]}) => {[K] => [K]}   Detachment,
⊢ [K] => {[K] => [K]}   axiom P1,
⊢ [K] => [K]   Detachment,
⊢ [(A) => (B)] => [(A) => (B)]   substitution.


End of proof of theorem 1.14.

Step 2.3


Finally, the third line in step 1 invokes Detachment, which instructions (D3) replace
by an instance of (the proof of) theorem 1.15:


⊢ [(A) => (B)] => (A)   step 2.1,
⊢ [(A) => (B)] => [(A) => (B)]   step 2.2,
⊢ [(A) => (B)] => (B)   theorem 1.15.


Hence the proof no longer assumes (A) => (B) as a hypothesis, but it still
assumes A as a hypothesis, thus proving that


A ⊢ {[(A) => (B)] => (B)}.


Step 3. 


Finally, the Deduction Theorem allows for the removal of the first hypothesis, A, 
from step 2. Here step 2 consists of steps 2.1, 2.2, and 2.3. 


Step 3.1 


In step 2.1 the first line consists of this hypothesis, A, whence instructions (D1)
replace A with (A) => (A) by a complete proof of theorem 1.14. In other words,
replace the first line in step 2, ⊢ A, by the following lines:

Proof of theorem 1.14:


⊢ [A] => ({[A] => [A]} => [A])   axiom P1,
⊢ {[A] => ({[A] => [A]} => [A])} => [([A] => {[A] => [A]}) => {[A] => [A]}]   axiom P2,
⊢ ([A] => {[A] => [A]}) => {[A] => [A]}   Detachment,
⊢ [A] => {[A] => [A]}   axiom P1,
⊢ [A] => [A]   Detachment.


End of proof of theorem 1.14.


Step 3.2 


The second line in step 2.1 is an instance of axiom P1, which instructions (D2)
replace by (A) => [(A) => {[(A) => (B)] => (A)}].


Step 3.3 


The third line in step 2.1 yields [(A) => (B)] => (A), from Detachment, which
instructions (D3) replace by a complete proof of (A) => {[(A) => (B)] => (A)},
as in theorem 1.15. In this case, however, such a proof would be correct but not
necessary, because (A) => {[(A) => (B)] => (A)} is merely an instance of axiom
P1. Because it is an axiom, all the preceding lines also become superfluous.


Step 3.4 


The result of step 2.2, [(A) => (B)] => [(A) => (B)], is proved by theorem 1.14.
Hence, instructions (D2) replace it by (A) => {[(A) => (B)] => [(A) => (B)]}.


Step 3.5 


Fully written out, the remaining lines in step 2.3 would follow the proof of 
theorem 1.15. Removing the hypothesis A then amounts to theorem 1.20, which 
forms the last line of the final proof: 


⊢ (A) => {[(A) => (B)] => (A)}   axiom P1 (from 3.3, replacing 2.1),
⊢ [(A) => (B)] => [(A) => (B)]   theorem 1.14 (from step 2.2),
⊢ (A) => {[(A) => (B)] => [(A) => (B)]}   theorem 1.12 (from 3.4, replacing 2.2),
⊢ (A) => {[(A) => (B)] => (B)}   theorem 1.20 (from 3.5, replacing 2.3).

Thus the Deduction Theorem has provided some guidance for the construction
of a proof of the theorem (A) => {[(A) => (B)] => (B)}. □


1.3.5 More Examples to Design Proofs of Implicational 
Theorems 


In patterns of deductive reasoning involving two premisses, the following theorems 
confirm that the order of the premisses does not matter. 


1.27 Theorem (transitive law of implication). The following formula is a theorem.


[(Q) => (R)] => {[(P) => (Q)] => [(P) => (R)]}.


Proof. The formula to be proved has the form (A) => [(B) => (C)], with A denoting
(Q) => (R), B denoting (P) => (Q), and C denoting (P) => (R).


Phase 1: deriving A, B ⊢ C and discharging B.


Designing a proof of (A) => [(B) => (C)] can start with a derivation A, B ⊢ C, in
this case (Q) => (R), (P) => (Q) ⊢ (P) => (R), which is exactly theorem 1.16.
Expanding the proof of theorem 1.16, which invokes theorems 1.12 and 1.15,
helps to discharge the hypothesis (P) => (Q):


⊢ (Q) => (R)   hypothesis A,
⊢ (P) => [(Q) => (R)]   theorem 1.12,
⊢ (P) => (R)   conclusion C, by theorem 1.15.


The preceding three steps form a derivation A, B ⊢ C of C from A and B,
but without invoking B, which plays a hidden role in the proof of theorem 1.15.
Replacing the citation of theorem 1.15 by its proof leads to a derivation of (B) =>
(C) from A:

⊢ (Q) => (R)   hypothesis A,
⊢ (P) => [(Q) => (R)]   theorem 1.12,
⊢ [(P) => (Q)] => [(P) => (R)]   (B) => (C) by theorem 1.13.

Replacing the citation of theorem 1.13 by its proof, which uses only axioms and
Detachment, gives a derivation of (B) => (C) from A directly from the axioms:


1 ⊢ (Q) => (R)   hypothesis A,
2 ⊢ (P) => [(Q) => (R)]   theorem 1.12,
3 ⊢ {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}   axiom P2,
4 ⊢ [(P) => (Q)] => [(P) => (R)]   2, 3, Detachment.


Lines 1-4 complete the derivation of [(P) => (Q)] => [(P) => (R)] from the first
hypothesis (Q) => (R) with axiom P2 and Detachment. Since the second hypothesis,
(P) => (Q), has not been used, it need not be discharged.


Phase 2: discharging A. 


An application of the Deduction Theorem (1.22) discharges the first hypothesis and
yields a proof of (A) => [(B) => (C)]. The resulting proof can be shortened, or
alternative proofs may result from trial and error. To this end, H, K, L refer to
theorem 1.16, with H denoting A:


⊢ [(Q) => (R)] => {(P) => [(Q) => (R)]}   axiom P1, of the form (H) => (K),
⊢ {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}   axiom P2, of the form (K) => (L),
⊢ [(Q) => (R)] => {[(P) => (Q)] => [(P) => (R)]}   theorem 1.16, of the form (H) => (L).

□


Swapping the premisses (P) => (Q) and (Q) => (R) also yields (P) => (R): 


1.28 Theorem (transitive law of implication, law of the hypothetical syllogism). 
The following formula is a theorem: 


[(P) => (Q)] => {[(Q) => (R)] => [(P) => (R)]}.


Proof. With notation as in the proof of theorem 1.27, the formula to be proved has
the form (B) => [(A) => (C)], with A denoting (Q) => (R), B denoting (P) => (Q),
and C denoting (P) => (R):

[(P) => (Q)] => {[(Q) => (R)] => [(P) => (R)]}.


Thus designing a proof of (B) => [(A) => (C)] can start with a derivation B, A ⊢ C,
in this case (P) => (Q), (Q) => (R) ⊢ (P) => (R), which is exactly theorem 1.16.
Hence steps as in the proof of theorem 1.27 discharge the hypotheses, but in the
reverse order. The resulting proof might then be shortened or give clues for a shorter
alternative proof. For example, apply axiom P1 with theorems 1.27, 1.13, and 1.16,
with H denoting (Q) => (R), K denoting (P) => (Q), and L denoting (P) => (R):


⊢ [(P) => (Q)] => {[(Q) => (R)] => [(P) => (Q)]}   axiom P1, of the form (K) => [(H) => (K)],
⊢ [(Q) => (R)] => {[(P) => (Q)] => [(P) => (R)]}   1.27, of the form (H) => [(K) => (L)],
⊢ {[(Q) => (R)] => [(P) => (Q)]} => {[(Q) => (R)] => [(P) => (R)]}   1.13, of the form [(H) => (K)] => [(H) => (L)],
⊢ [(P) => (Q)] => {[(Q) => (R)] => [(P) => (R)]}   1.16.

□


1.3.6 Another Guide for Proofs: Substitutivity of Equivalences 


Besides the Deduction Theorem (theorem 1.22), another guide to design proofs 
consists of replacing any occurrence of a formula by an equivalent formula, thanks 
to theorem 1.29 [18, p. 101, 124, 189], [108, p. 48]. 


1.29 Theorem (Substitutivity of Equivalence in the Pure Positive Implicational
Propositional Calculus, preliminary version). For all well-formed implicational
logical formulae U and V, if ⊢ (U) => (V) and ⊢ (V) => (U), and if a formula Q
results from substituting any (zero, one, several, or all) occurrence(s) of U by V in
a well-formed formula P, then ⊢ (P) => (Q) and ⊢ (Q) => (P).


Proof (Outline). This proof proceeds by cases and subcases.
In all cases, if Q is P, which results by substituting none of the occurrences of U
by V, then each of (P) => (Q) and (Q) => (P) is (P) => (P), which is theorem 1.14.

If P is U, then Q is either U or V. If Q is V, then (P) => (Q) and (Q) => (P)
become (U) => (V) and (V) => (U), which are the hypotheses.

If P is (U) => (W), then Q is either (U) => (W), which is P, or Q is (V) => (W).
If Q is (V) => (W), then the hypothesis ⊢ (V) => (U) yields ⊢ (P) => (Q):

⊢ (V) => (U)   hypothesis,
⊢ [(V) => (U)] => {[(U) => (W)] => [(V) => (W)]}   theorem 1.28,
⊢ [(U) => (W)] => [(V) => (W)]   Detachment, which is (P) => (Q).

Similarly, the hypothesis ⊢ (U) => (V) yields ⊢ (Q) => (P):

⊢ (U) => (V)   hypothesis,
⊢ [(U) => (V)] => {[(V) => (W)] => [(U) => (W)]}   theorem 1.28,
⊢ [(V) => (W)] => [(U) => (W)]   Detachment, which is (Q) => (P).

If P is (W) => (U), then Q is either (W) => (U) or (W) => (V). If Q is (W) =>
(V), then P with (U) => (V) yield Q, and Q with (V) => (U) yield P, by
transitivity (theorem 1.16).

If P is (W) => (U), then Q is either (W) => (U), which is P, or Q is (W) =>
(V). If Q is (W) => (V), then the hypothesis ⊢ (V) => (U) yields ⊢ (Q) => (P):

⊢ (V) => (U)   hypothesis,
⊢ [(V) => (U)] => {[(W) => (V)] => [(W) => (U)]}   theorem 1.27,
⊢ [(W) => (V)] => [(W) => (U)]   Detachment, which is (Q) => (P).

Similarly, the hypothesis ⊢ (U) => (V) yields ⊢ (P) => (Q):

⊢ (U) => (V)   hypothesis,
⊢ [(U) => (V)] => {[(W) => (U)] => [(W) => (V)]}   theorem 1.27,
⊢ [(W) => (U)] => [(W) => (V)]   Detachment, which is (P) => (Q).


The general case follows by several applications of the previous cases, in a
way that may be specified more explicitly after the availability of the Principle of
Mathematical Induction in chapter 4. □


1.30 Example. If U denotes (H) => (K), and V denotes (H) => [(H) => (K)], then
⊢ (U) => (V) and ⊢ (V) => (U):

⊢ [(H) => (K)] => {(H) => [(H) => (K)]}   axiom P1, which is (U) => (V),
⊢ {(H) => [(H) => (K)]} => [(H) => (K)]   theorem 1.24, which is (V) => (U).

Also, if P denotes (L) => [(H) => (K)], which is (L) => (U), and Q denotes
(L) => {(H) => [(H) => (K)]}, which is (L) => (V), then (P) => (Q)
and (Q) => (P) become

⊢ {(L) => [(H) => (K)]} => [(L) => {(H) => [(H) => (K)]}]   theorem 1.29,
⊢ [(L) => {(H) => [(H) => (K)]}] => {(L) => [(H) => (K)]}   theorem 1.29,

by axiom P1 with theorems 1.24 and 1.29.
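The phrase "zero, one, several, or all occurrences" in theorem 1.29 can be made concrete by enumerating every formula Q obtainable from P. The following Python sketch does so for the formulae of example 1.30. It is only an illustration: formulae are nested tuples, all function names are hypothetical, and the final check uses classical truth tables as a semantic sanity check, which is an assumption of this sketch and not a derivation within the implicational calculus.

```python
from itertools import product

def imp(a, b):
    return ("=>", a, b)

def variants(p, u, v):
    """All formulae obtained from p by replacing zero or more occurrences of u by v."""
    results = set()
    if p == u:
        results.add(v)                       # replace this occurrence
    if isinstance(p, tuple):
        _, a, b = p                          # substitute inside both sides
        results |= {imp(x, y) for x in variants(a, u, v) for y in variants(b, u, v)}
    else:
        results.add(p)                       # a variable left unchanged
    return results

def atoms(f):
    return {f} if isinstance(f, str) else atoms(f[1]) | atoms(f[2])

def value(f, env):
    # Classical truth table of the implication.
    return env[f] if isinstance(f, str) else (not value(f[1], env)) or value(f[2], env)

def is_tautology(f):
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, row)))
               for row in product([False, True], repeat=len(names)))

def equivalent(p, q):
    return is_tautology(imp(p, q)) and is_tautology(imp(q, p))

# Example 1.30: U is (H) => (K), V is (H) => [(H) => (K)], P is (L) => (U).
U = imp("H", "K")
V = imp("H", U)
P = imp("L", U)
for Q in variants(P, U, V):
    print(Q, equivalent(P, Q))
```

Note that the substitution is not applied again inside the inserted copy of V, even though V itself contains U.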
1.3.7 More Derived Rules of Inference


The following derived rules allow for substitutions within implications subject to 
hypotheses, for instance, a substitution within an intermediate hypothesis. 


1.31 Theorem (derived rule). For all well-formed formulae H, K, L, M, if
(H) => [(L) => (M)] and
(K) => (L)
are theorems, then
(H) => [(K) => (M)]
is also a theorem.


Proof. Apply theorems 1.28 and 1.16:


⊢ (K) => (L)   hypothesis,
⊢ [(K) => (L)] => {[(L) => (M)] => [(K) => (M)]}   theorem 1.28,
⊢ [(L) => (M)] => [(K) => (M)]   Detachment,
⊢ (H) => [(L) => (M)]   hypothesis,
⊢ (H) => [(K) => (M)]   theorem 1.16.

□

The second theorem allows for a substitution in the conclusion. 


1.32 Theorem (derived rule). For all well-formed formulae H, L, M, N, if
(H) => [(L) => (M)] and
(M) => (N)
are theorems, then
(H) => [(L) => (N)]
is also a theorem.


Proof. Apply theorems 1.28, 1.16, and 1.17, with P denoting (L) => (M) and Q
denoting [(M) => (N)] => [(L) => (N)]:
⊢ (H) => [(L) => (M)]   hypothesis, of the form (H) => (P),
⊢ [(L) => (M)] => {[(M) => (N)] => [(L) => (N)]}   theorem 1.28, of the form (P) => (Q),
⊢ (H) => {[(M) => (N)] => [(L) => (N)]}   theorem 1.16, of the form (H) => (Q),
⊢ (M) => (N)   hypothesis,
⊢ (H) => [(L) => (N)]   theorem 1.17.


□
The following three derived rules of inference will simplify subsequent proofs. 


1.33 Theorem (derived rule). If (K) => (L) is a theorem, then the following
formula is also a theorem: [(H) => (K)] => [(H) => (L)].

Proof. Apply the transitivity of implication in the form of theorem 1.27:


⊢ (K) => (L)   hypothesis,
⊢ [(K) => (L)] => {[(H) => (K)] => [(H) => (L)]}   theorem 1.27,
⊢ [(H) => (K)] => [(H) => (L)]   Detachment.


□


1.34 Theorem (derived rule). If (L) => (H) is a theorem, then the following
formula is also a theorem: [(H) => (K)] => [(L) => (K)].


Proof. Apply the transitivity of implication in the form of theorem 1.28:


⊢ (L) => (H)   hypothesis,
⊢ [(L) => (H)] => {[(H) => (K)] => [(L) => (K)]}   theorem 1.28,
⊢ [(H) => (K)] => [(L) => (K)]   Detachment.


□


1.35 Theorem (derived rule). If (A) => (B) and (C) => (D) are theorems, then
the following formula is also a theorem: [(B) => (C)] => [(A) => (D)].


Proof. Apply theorems 1.34 and 1.33, with H, K, L as in theorem 1.16:
⊢ (A) => (B)   hypothesis,
⊢ [(B) => (C)] => [(A) => (C)]   theorem 1.34, of the form (H) => (K),
⊢ (C) => (D)   hypothesis,
⊢ [(A) => (C)] => [(A) => (D)]   theorem 1.33, of the form (K) => (L),
⊢ [(B) => (C)] => [(A) => (D)]   theorem 1.16, of the form (H) => (L).


□
In contrast to the preceding derived rules, theorem 1.36 reveals a different pattern. 


1.36 Theorem (derived rule). If [(H) => (L)] => (M) and (H) => (K) are
theorems, then the following formula is also a theorem: [(K) => (L)] => (M).


Proof. Apply theorems 1.27, 1.13, 1.12, and 1.16, with P denoting (K) => (L),
Q denoting (H) => (K), and R denoting (H) => (L):
⊢ [(K) => (L)] => {[(H) => (K)] => [(H) => (L)]}   1.27, of the form (P) => [(Q) => (R)],
⊢ {[(K) => (L)] => [(H) => (K)]} => {[(K) => (L)] => [(H) => (L)]}   1.13, of the form [(P) => (Q)] => [(P) => (R)],
⊢ (H) => (K)   hypothesis, which is Q,
⊢ [(K) => (L)] => [(H) => (K)]   1.12, of the form (P) => (Q),
⊢ [(K) => (L)] => [(H) => (L)]   Detachment, of the form (P) => (R),
⊢ [(H) => (L)] => (M)   hypothesis, of the form (R) => (M),
⊢ [(K) => (L)] => (M)   1.16, of the form (P) => (M).

□


1.3.8 The Laws of Commutation and of Assertion 


The following “Law of Commutation” allows for yet another change in the order of 
hypotheses: 


{(P) => [(Q) => (R)]} => {(Q) => [(P) => (R)]}.


The Law of Commutation is one of Frege’s axioms [39, p. 146, eq. (8)]. In Frege’s 
1879 words, the Law of Commutation states that 


If a proposition is the consequence of two propositions, their order is immaterial [39, 
p. 147]. 


In 1935, however, Łukasiewicz showed that the Law of Commutation is a theorem
derivable from Detachment with axioms P1 and P2 [80, p. 127], as proved by
theorem 1.37. (The time lapse between 1879 and 1935 indicates that recognizing
whether a formula is a theorem can also be difficult for specialists.)


1.37 Theorem (law of commutation). The following formula is a theorem: 


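{(P) => [(Q) => (R)]} => {(Q) => [(P) => (R)]}.


Proof. Apply theorem 1.31, with H denoting (P) => [(Q) => (R)], K denoting Q,
L denoting (P) => (Q), and M denoting (P) => (R):
⊢ {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}   axiom P2, of the form (H) => [(L) => (M)],
⊢ (Q) => [(P) => (Q)]   axiom P1, of the form (K) => (L),
⊢ {(P) => [(Q) => (R)]} => {(Q) => [(P) => (R)]}   theorem 1.31, of the form (H) => [(K) => (M)].


□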


With a shorter proof relying on the law of commutation, the following theorem
combines the rule of Detachment into a single formula.

1.38 Theorem (law of assertion). The formula (H) => {[(H) => (C)] => (C)} is
a theorem.


Proof. Apply theorems 1.14 and 1.37 with Detachment, with P denoting (H) => (C),
Q denoting H, and R denoting C:

⊢ [(H) => (C)] => [(H) => (C)]   theorem 1.14, of the form (P) => [(Q) => (R)],
⊢ (H) => {[(H) => (C)] => (C)}   commutation (1.37) and Detachment, of the form (Q) => [(P) => (R)].

□


Also relying on the law of commutation, the following derived rule will shorten
the proof of subsequent results, in particular, theorem 1.50.


1.39 Theorem (derived rule). If (U) => [(V) => (W)]
and (H) => [(W) => (R)],
then (H) => {(U) => [(V) => (R)]}.


Proof. Apply the transitivity of implication (theorem 1.28):


⊢ [(V) => (W)] => {[(W) => (R)] => [(V) => (R)]}   theorem 1.28,
⊢ (U) => ([(V) => (W)] => {[(W) => (R)] => [(V) => (R)]})   theorem 1.12,
⊢ (U) => [(V) => (W)]   hypothesis,
⊢ (U) => {[(W) => (R)] => [(V) => (R)]}   theorem 1.15,
⊢ {(U) => [(W) => (R)]} => {(U) => [(V) => (R)]}   theorem 1.13,
⊢ (H) => [(W) => (R)]   hypothesis,
⊢ (U) => {(H) => [(W) => (R)]}   theorem 1.12,
⊢ (H) => {(U) => [(W) => (R)]}   commutation (1.37),
⊢ (H) => {(U) => [(V) => (R)]}   transitivity.

□

1.3.9 Exercises on the Classical Implicational Calculus 


The foregoing theorems involve only implications but no negation, and their 
proofs do not involve any negation either. Nevertheless, there exist other theorems 
involving only => but not ¬ for which there does not exist any proof involving only
implications. Examples of such theorems are hidden in the exercises, to be revealed 
later. Investigate whether the formulae in the following exercises are theorems, using 
any of the axioms P1 and P2, any rules of inference, and any of the theorems just 
proved. 


11. (MD) > DO) > tA = [(K) = Ob 
12. (kK) => OY) > (= (kK) = MO} 
13. [(A) > (B)] > [{14) > @] > B} > @B] 
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1.4. [(A) > ()] > [{(4) > ®@] > A} > A] 

1.5. {[(P) => (P)] => (P)} => (P)

1.6. (P) => {[(P) => (P)] => ()}

1.7. {[(P) => (Q)] => (P)} => (P) (Peirce’s law.)

18. [(P) > (@)] > [{P) > Q] > ®@} > @®)] 

19. (2) > QI > PM) > IR) > Q@] => [() > Mp 
1.10. [(R) => (Q)] > ({[(R) > (Q)] > (P)} = [(S) => (P)]) 


1.3.10 Equivalent Implicational Axiom Systems 


The Classical Implicational Calculus just presented rests on the rules of Detachment
and Substitution with Frege’s axioms P1 and P2:


Frege’s axiom P1   (P) => [(Q) => (P)].
Frege’s axiom P2   {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}.


Other selections of axioms exist, for instance, Stephen Cole Kleene’s [72, p. 15],


Kleene’s axiom 1a   (A) => [(B) => (A)],
Kleene’s axiom 1b   [(A) => (B)] => ({(A) => [(B) => (C)]} => [(A) => (C)]),


and Tarski’s [129, p. 147],


Tarski’s axiom I   (P) => [(Q) => (P)],
Tarski’s axiom II   {(P) => [(P) => (Q)]} => [(P) => (Q)],
Tarski’s axiom III   [(P) => (Q)] => {[(Q) => (R)] => [(P) => (R)]}.


Frege’s, Kleene’s, and Tarski’s implicational axiom systems are mutually equiva- 
lent, in the sense that each system leads to the same Pure Positive Implicational 
Propositional Calculus. 

Indeed, their first axioms, PI, la, and I are mutually identical. 

Second, Kleene’s axiom 1b results from applying the law of commutation 
(theorem 1.37) to Frege’s axiom P2. Consequently, both of Kleene’s axioms la 
and 1b are theorems derivable from Frege’s axioms P| and P2. Thus, prepending 
derivations of Kleene’s axioms from Frege’s axioms to any proof of any theorem 
from Kleene’s axioms yields a proof of the same theorem from Frege’s axioms. 
Therefore, every theorem derivable from Kleene’s axioms is also derivable from 
Frege’s axioms. 

Similarly, Tarski’s axiom II is theorem 1.24, while Tarski’s axiom III is the- 
orem 1.28. Consequently, all three of Tarski’s axioms I, II, and III are theorems 
derivable from Frege’s axioms P1 and P2. Therefore, every theorem derivable from 
Tarski’s axioms is also derivable from Frege’s axioms. 
The exercises establish the converse derivations. From Kleene’s axioms,
exercises 1.11 and 1.13 derive Tarski’s axiom II, whereas exercises 1.14–1.22 derive
Tarski’s axiom III. Thus exercises 1.11–1.22 show that every theorem derivable
from Tarski’s axioms is also derivable from Kleene’s axioms. In turn, from Tarski’s
axioms, exercises 1.23–1.32 derive the law of commutation following Tarski’s
outline [129, p. 148-149], and thence Frege’s axiom P2 from Tarski’s axiom III.
Thus exercises 1.23–1.36 show that every theorem derivable from Frege’s axioms
is derivable from Tarski’s axioms, and thus also from Kleene’s axioms.

However, after the proof of (P) ⇒ (P) in theorem 1.14, Frege’s axioms lead
immediately to the Deduction Theorem (1.22), whereas the exercises reveal that 
several intermediate theorems stand between Kleene’s or Tarski’s axioms and the 
Deduction Theorem, which may explain the popularity of Frege’s axioms, already 
announced before definition 1.4 on page 6. 
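
As a semantic sanity check on the three axiom systems above, the following sketch evaluates each axiom under every assignment of truth values. It is only an illustration under stated assumptions: the helper names `implies`, `is_tautology`, and `AXIOMS` are hypothetical, implication is read by its usual truth table, and the check confirms only that each axiom is a tautology. It does not replace the derivations that establish the mutual equivalence of the systems.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Usual truth table of the implication (a) => (b)."""
    return (not a) or b


def is_tautology(formula, arity: int) -> bool:
    """True if formula(...) holds under every assignment of truth values."""
    return all(formula(*values)
               for values in product([False, True], repeat=arity))


# Each entry pairs a formula (as a Boolean function) with its number of letters.
AXIOMS = {
    "Frege P1 = Kleene 1a = Tarski I":
        (lambda p, q: implies(p, implies(q, p)), 2),
    "Frege P2":
        (lambda p, q, r: implies(implies(p, implies(q, r)),
                                 implies(implies(p, q), implies(p, r))), 3),
    "Kleene 1b":
        (lambda a, b, c: implies(implies(a, b),
                                 implies(implies(a, implies(b, c)),
                                         implies(a, c))), 3),
    "Tarski II":
        (lambda p, q: implies(implies(p, implies(p, q)), implies(p, q)), 2),
    "Tarski III":
        (lambda p, q, r: implies(implies(p, q),
                                 implies(implies(q, r), implies(p, r))), 3),
}

for name, (formula, arity) in AXIOMS.items():
    print(name, is_tautology(formula, arity))
```

Every line of output should report `True`. A formula such as (P) ⇒ (Q), by contrast, fails the same check, which shows that the test can distinguish tautologies from other formulae.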


1.3.11 Exercises on Kleene’s Axioms 


For the following exercises, use only Kleene’s axioms la and 1b with Substitution 
and Detachment. 


1.11. Prove (P) ⇒ (P).


1.12. Establish the derived rule of inference that if T is a theorem, then (S) ⇒ (T)
is also a theorem.


1.13. Prove {(P) ⇒ [(P) ⇒ (Q)]} ⇒ [(P) ⇒ (Q)].


1.14. Establish the derived rule of inference that if (H) ⇒ (K) and (K) ⇒ (L) are
theorems, then (H) ⇒ (L) is also a theorem.


1.15. Establish the derived rule of inference that if T is a theorem, then {(A) ⇒
[(T) ⇒ (C)]} ⇒ [(A) ⇒ (C)] is also a theorem.


1.16. Prove [(B) ⇒ (C)] ⇒ {(A) ⇒ [(B) ⇒ (C)]}.
1.17. Prove 


{(B) > ©] (4) + [B) > OF > (4) > (C)} 
{(B) > ©] > (A) > OB. 


1.18. Prove 


B= O)]=(A)> Bp 


(+ Ol {14> @1>+((4)>1B) +O} +14) ()}) 


{[B)>+(Ol> (A) 18) > OFS) >(C)))t | 
1.19. Prove


{((B) = (C)] = (A) > BY} 
> {(B) > (©) > (KA) + (8) > ©} > (4) > ©) 


1.20. Prove 


IB) = ©] = (4) = @®i > IB = ©] = (4) = ©}. 


1.21. Prove [(A) ⇒ (B)] ⇒ {[(B) ⇒ (C)] ⇒ [(A) ⇒ (B)]}.
1.22. Prove [(A) ⇒ (B)] ⇒ {[(B) ⇒ (C)] ⇒ [(A) ⇒ (C)]}.


1.3.12 Exercises on Tarski’s Axioms


For the following exercises, use only Tarski’s axioms I, II, and III, with Substitution 
and Detachment. 


1.23. Establish the derived rule of inference that if (H) ⇒ (K) and (K) ⇒ (L) are
theorems, then (H) ⇒ (L) is also a theorem.


1.24. Establish the derived rule of inference that if T is a theorem, then (S) ⇒ (T)
is also a theorem.


1.25. Prove (P) ⇒ {[(P) ⇒ (Q)] ⇒ (P)}.
1.26. Prove


{[(P) ⇒ (Q)] ⇒ (P)} ⇒ ([(P) ⇒ (Q)] ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}).


1.27. Prove


(P) ⇒ ([(P) ⇒ (Q)] ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}).


1.28. Prove


([(P) ⇒ (Q)] ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}.


1.29. Prove the law of assertion:


(P) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}.
1.30. Prove


{(P) ⇒ [(Q) ⇒ (R)]} ⇒ ({[(Q) ⇒ (R)] ⇒ (R)} ⇒ [(P) ⇒ (R)]).


1.31. Prove


({[(Q) ⇒ (R)] ⇒ (R)} ⇒ [(P) ⇒ (R)]) ⇒ {(Q) ⇒ [(P) ⇒ (R)]}.


1.32. Prove the law of commutation:


{(P) ⇒ [(Q) ⇒ (R)]} ⇒ {(Q) ⇒ [(P) ⇒ (R)]}.


1.33. Prove Frege’s axiom 2:


[(Q) ⇒ (R)] ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.


1.34. Prove


{(Q) ⇒ [(P) ⇒ (R)]} ⇒ ([(P) ⇒ (Q)] ⇒ {(P) ⇒ [(P) ⇒ (R)]}).


1.35. Prove


[(P) ⇒ (Q)] ⇒ ({(P) ⇒ [(Q) ⇒ (R)]} ⇒ [(P) ⇒ (R)]).


1.36. Prove


{(P) ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.


1.4 Proofs by the Converse Law of Contraposition 


The Pure Positive Implicational Propositional Calculus belongs to several logical 
systems, which differ from one another by their different axioms about negation. 
For instance, classical logic defines its concept of negation by the converse law of 
contraposition: 


Axiom P3: {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)].


1.4.1 Examples of Proofs in the Full Propositional Calculus 


The following proofs demonstrate the use of the converse law of contraposition. 




1.40 Theorem (law of denial of the antecedent). For all well-formed formulae P
and Q, the following formula is a theorem: [¬(P)] ⇒ [(P) ⇒ (Q)].


Proof. Apply axioms P1 and P3 with the transitivity of implication (theorem 1.16):


⊢ [¬(P)] ⇒ {[¬(Q)] ⇒ [¬(P)]}        substitution in axiom P1,
⊢ {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)]   axiom P3,
⊢ [¬(P)] ⇒ [(P) ⇒ (Q)]              theorem 1.16.
□

1.41 Theorem. The formula (P) ⇒ {[¬(P)] ⇒ (Q)} is a theorem (schema).

Proof. Apply theorem 1.40 and the law of commutation (theorem 1.37):

⊢ [¬(P)] ⇒ [(P) ⇒ (Q)]                              theorem 1.40,
⊢ {[¬(P)] ⇒ [(P) ⇒ (Q)]} ⇒ ((P) ⇒ {[¬(P)] ⇒ (Q)})   commutation (1.37),
⊢ (P) ⇒ {[¬(P)] ⇒ (Q)}                              Detachment.
□


The following two theorems establish the complete law of double negation. 


1.42 Theorem (law of double negation). The formula {¬[¬(P)]} ⇒ (P) is a
theorem (schema).


Proof. Apply the transitivity of implication (theorem 1.19) and theorem 1.25,
writing ¬¬ for two successive negations:


⊢ [¬¬(P)] ⇒ {[¬¬¬¬(P)] ⇒ [¬¬(P)]}               axiom P1,
⊢ {[¬¬¬¬(P)] ⇒ [¬¬(P)]} ⇒ {[¬(P)] ⇒ [¬¬¬(P)]}   axiom P3,
⊢ {[¬(P)] ⇒ [¬¬¬(P)]} ⇒ {[¬¬(P)] ⇒ (P)}         axiom P3,
⊢ [¬¬(P)] ⇒ {[¬¬(P)] ⇒ (P)}                     theorem 1.19,
⊢ [¬¬(P)] ⇒ (P)                                 theorem 1.25.
□


1.43 Theorem (converse law of double negation). The formula (P) ⇒ {¬[¬(P)]}
is a theorem (schema).


Proof. Apply the law of double negation (theorem 1.42) and contraposition (P3):


⊢ (¬{¬[¬(P)]}) ⇒ [¬(P)]                            theorem 1.42,
⊢ {(¬{¬[¬(P)]}) ⇒ [¬(P)]} ⇒ [(P) ⇒ {¬[¬(P)]}]      axiom P3,
⊢ (P) ⇒ {¬[¬(P)]}                                  Detachment.
□
With axiom P3, the following theorem gives the complete law of contraposition. 


1.44 Theorem (law of contraposition, principle of transposition). The following
formula is a theorem (schema): [(P) ⇒ (Q)] ⇒ {[¬(Q)] ⇒ [¬(P)]}.


Proof. Apply transitivity (theorem 1.16) with A, B, C, D as in theorem 1.35,
where A stands for ¬[¬(P)], B for P, C for Q, and D for ¬[¬(Q)]:

⊢ {¬[¬(P)]} ⇒ (P)                                 theorem 1.42, (A) ⇒ (B),
⊢ (Q) ⇒ {¬[¬(Q)]}                                 theorem 1.43, (C) ⇒ (D),
⊢ [(P) ⇒ (Q)] ⇒ ({¬[¬(P)]} ⇒ {¬[¬(Q)]})           theorem 1.35, [(B) ⇒ (C)] ⇒ [(A) ⇒ (D)],
⊢ ({¬[¬(P)]} ⇒ {¬[¬(Q)]}) ⇒ {[¬(Q)] ⇒ [¬(P)]}     axiom P3,
⊢ [(P) ⇒ (Q)] ⇒ {[¬(Q)] ⇒ [¬(P)]}                 theorem 1.16.
□


Theorem 1.44 is the theoretical basis for reasoning by contraposition. Theo- 
rem 1.45 syncopates several steps for later use. 


1.45 Theorem. For all well-formed formulae P and Q, the following formula is a
theorem: {[¬(P)] ⇒ (Q)} ⇒ {[¬(Q)] ⇒ (P)}.


Proof. Apply the principle of transposition and the law of double negation:

⊢ {[¬(P)] ⇒ (Q)} ⇒ ([¬(Q)] ⇒ {¬[¬(P)]})   transposition (1.44),
⊢ {¬[¬(P)]} ⇒ (P)                         double negation (1.42),
⊢ {[¬(P)] ⇒ (Q)} ⇒ {[¬(Q)] ⇒ (P)}         derived rule (1.32).


1.4.2 Guides for Proofs in the Propositional Calculus 


Besides the Deduction Theorem (theorem 1.22), an extension of the substitutivity
of equivalence in the implicational calculus (theorem 1.29) also allows for the
replacement of any occurrence of a formula by an equivalent formula containing
negations, thanks to theorem 1.46 [18, p. 101, 124, 189], [108, p. 48].


1.46 Theorem (Substitutivity of Equivalence in the Pure Propositional Calculus,
preliminary version). For all well-formed propositional formulae U and V, if
⊢ (U) ⇒ (V) and ⊢ (V) ⇒ (U), and if a formula Q results from substituting any
(zero, one, several, or all) occurrence(s) of U by V in a well-formed formula P, then


⊢ (P) ⇒ (Q) and ⊢ (Q) ⇒ (P).


Proof (Outline). Theorem 1.29 has already established the conclusions for
implications.

For negations, if P is ¬(U), then Q is either ¬(U) or ¬(V). If Q is ¬(V), then
(P) ⇒ (Q) and (Q) ⇒ (P) become [¬(U)] ⇒ [¬(V)] and [¬(V)] ⇒ [¬(U)],
which hold by the law of contraposition (theorem 1.44):


⊢ (U) ⇒ (V)                           hypothesis,
⊢ [(U) ⇒ (V)] ⇒ {[¬(V)] ⇒ [¬(U)]}     theorem 1.44,
⊢ [¬(V)] ⇒ [¬(U)]                     Detachment, which is (Q) ⇒ (P);

⊢ (V) ⇒ (U)                           hypothesis,
⊢ [(V) ⇒ (U)] ⇒ {[¬(U)] ⇒ [¬(V)]}     theorem 1.44,
⊢ [¬(U)] ⇒ [¬(V)]                     Detachment, which is (P) ⇒ (Q).

The general case follows by several applications of the previous case and the
cases in theorem 1.29, in a way that may be specified more explicitly after the
availability of the Principle of Mathematical Induction in chapter 4. □


1.4.3 Proofs by Reductio ad Absurdum 


Within classical logic, a proposition and its negation together form an “absurdity” 
that cannot hold. In particular, if a hypothesis implies a conclusion and its 
negation — an absurdity — then the hypothesis may be rejected. The following 
theorems establish the validity of such a pattern of reasoning, called reduction to 
the absurd. 


1.47 Theorem (special law of reductio ad absurdum). For each well-formed
formula P, the following formula is a theorem: {(P) ⇒ [¬(P)]} ⇒ [¬(P)].


Proof. Start with theorem 1.44 and the denial of the antecedent (theorem 1.40):

⊢ {(P) ⇒ [¬(P)]} ⇒ ({¬[¬(P)]} ⇒ [¬(P)])                      (1.44),
⊢ {¬[¬(P)]} ⇒ ([¬(P)] ⇒ {¬[(P) ⇒ (P)]})                      (1.40),
⊢ [{¬[¬(P)]} ⇒ ([¬(P)] ⇒ {¬[(P) ⇒ (P)]})]
    ⇒ [({¬[¬(P)]} ⇒ [¬(P)]) ⇒ ({¬[¬(P)]} ⇒ {¬[(P) ⇒ (P)]})]  (P2),
⊢ ({¬[¬(P)]} ⇒ [¬(P)]) ⇒ ({¬[¬(P)]} ⇒ {¬[(P) ⇒ (P)]})        (M.P.),
⊢ ({¬[¬(P)]} ⇒ {¬[(P) ⇒ (P)]}) ⇒ {[(P) ⇒ (P)] ⇒ [¬(P)]}      (P3),
⊢ {(P) ⇒ [¬(P)]} ⇒ {[(P) ⇒ (P)] ⇒ [¬(P)]}                    (1.19),
⊢ (P) ⇒ (P)                                                  (1.14),
⊢ {(P) ⇒ [¬(P)]} ⇒ [¬(P)]                                    (1.17). □


1.48 Theorem (law of reductio ad absurdum). For all well-formed formulae P
and Q, the following formula is a theorem:


[(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)]).


Proof. Apply theorems 1.44, 1.12, 1.37, 1.16, 1.47, 1.32:


⊢ [(P) ⇒ (Q)] ⇒ {[¬(Q)] ⇒ [¬(P)]}                              (1.44),
⊢ (P) ⇒ ([(P) ⇒ (Q)] ⇒ {[¬(Q)] ⇒ [¬(P)]})                      (1.12),
⊢ [(P) ⇒ (Q)] ⇒ [(P) ⇒ {[¬(Q)] ⇒ [¬(P)]}]                      (1.37),
⊢ [(P) ⇒ {[¬(Q)] ⇒ [¬(P)]}] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ {(P) ⇒ [¬(P)]})  (P2),
⊢ [(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ {(P) ⇒ [¬(P)]})              (1.16),
⊢ {(P) ⇒ [¬(P)]} ⇒ [¬(P)]                                      (1.47),
⊢ [(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)])                      (1.32). □


Theorem 1.48 is the theoretical basis for the pattern of reasoning by reduction to
the absurd: if (P) ⇒ (Q) and (P) ⇒ [¬(Q)] are theorems, then theorem 1.48 and
Detachment twice prove that ¬(P) is a theorem. As a special case of theorem 1.48,
theorem 1.47 shows that if a statement P implies its negation ¬(P), then ¬(P) is a
theorem.
1.4.4 Proofs by Cases


Theorem 1.49 provides a variation on the theoretical foundation of proofs by cases:
if, in the first case, a conclusion R follows from a hypothesis P, and if, in the
second case, the same conclusion R also follows from the negation ¬(P) of the
same hypothesis P, then R is a theorem, derivable from the axioms and inference
rules, so that the hypothesis is superfluous.


1.49 Theorem (proof by cases). For all well-formed formulae P and R,
P1, P2, P3 ⊢ {[¬(P)] ⇒ (R)} ⇒ {[(P) ⇒ (R)] ⇒ (R)}.


Proof. Apply the laws of double negation (theorem 1.42), contraposition
(theorem 1.44), and reduction to the absurd (theorem 1.48):

⊢ {[¬(R)] ⇒ (P)} ⇒ ({[¬(R)] ⇒ [¬(P)]} ⇒ {¬[¬(R)]})   theorem 1.48,
⊢ {¬[¬(R)]} ⇒ (R)                                    double negation (1.42),
⊢ {[¬(R)] ⇒ (P)} ⇒ ({[¬(R)] ⇒ [¬(P)]} ⇒ (R))         derived rule (1.32),
⊢ [(P) ⇒ (R)] ⇒ {[¬(R)] ⇒ [¬(P)]}                    transposition (1.44),
⊢ {[¬(R)] ⇒ (P)} ⇒ {[(P) ⇒ (R)] ⇒ (R)}               derived rule (1.31),
⊢ {[¬(P)] ⇒ (R)} ⇒ {[¬(R)] ⇒ (P)}                    theorem 1.45,
⊢ {[¬(P)] ⇒ (R)} ⇒ {[(P) ⇒ (R)] ⇒ (R)}               derived rule (1.16).
□


Theorem 1.50 generalizes theorem 1.49 to the situation with an intermediate 
hypothesis [18, p. 205, footnote 355]. 


1.50 Theorem (proof by cases subject to hypotheses). For all well-formed formulae
P, Q, and R,
P1, P2, P3 ⊢ {[¬(P)] ⇒ (R)} ⇒ ([(Q) ⇒ (R)] ⇒ {[(P) ⇒ (Q)] ⇒ (R)}).


Proof. Apply the law of proof by cases (theorem 1.49), the transitivity of implication
(theorem 1.27), and a derived rule (theorem 1.39), with H for [¬(P)] ⇒ (R), U for
(Q) ⇒ (R), V for (P) ⇒ (Q), and W for (P) ⇒ (R):


⊢ {[¬(P)] ⇒ (R)} ⇒ {[(P) ⇒ (R)] ⇒ (R)}                  theorem 1.49, (H) ⇒ [(W) ⇒ (R)],
⊢ [(Q) ⇒ (R)] ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}             theorem 1.27, (U) ⇒ [(V) ⇒ (W)],
⊢ {[¬(P)] ⇒ (R)} ⇒ ([(Q) ⇒ (R)] ⇒ {[(P) ⇒ (Q)] ⇒ (R)})  theorem 1.39, (H) ⇒ {(U) ⇒ [(V) ⇒ (R)]}.
□
1.4.5 Exercises on Frege’s and Church’s Axioms

For the following four exercises, use the rules of inference and only the following
six axioms, due to Frege [39]:

Axiom F1 [39, p. 137, eq. (1)]: (P) ⇒ [(Q) ⇒ (P)].

Axiom F2 [39, p. 137, eq. (2)]: {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.

Axiom F3 [39, p. 146, eq. (8)]: {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {(Q) ⇒ [(P) ⇒ (R)]}.

Axiom F4 [39, p. 154, eq. (28)]: [(P) ⇒ (Q)] ⇒ {[¬(Q)] ⇒ [¬(P)]}.

Axiom F5 [39, p. 156, eq. (31)]: {¬[¬(P)]} ⇒ (P).

Axiom F6 [39, p. 158, eq. (41)]: (P) ⇒ {¬[¬(P)]}.

1.37. Prove that F1–F6 ⊢ (P) ⇒ {[¬(P)] ⇒ [¬(Q)]}.

1.38. Prove that F1–F6 ⊢ [¬(P)] ⇒ {(P) ⇒ [¬(P)]}.

1.39. Prove that F1–F6 ⊢ {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)].


1.40. Prove that Frege’s six axioms F1–F6 are logically equivalent to
Łukasiewicz’s three axioms P1, P2, P3.


The following exercises outline a proof that the axioms P1, P2, P3 of classical
logic are logically equivalent to the following three axioms C1, C2, C3, used by
Church [18, §10, p. 72] and Robbin [108, p. 14]. Because the two logical systems
have the same first two axioms, they also have the same implicational calculus. The
two logical systems differ from each other only by their third axiom, where F stands
for False, so that ¬(F) is a theorem.


Axiom C1   (P) ⇒ [(Q) ⇒ (P)].
Axiom C2   {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.
Axiom C3   {[(P) ⇒ (F)] ⇒ (F)} ⇒ (P).
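
To see why (P) ⇒ (F) can play the role of a negation, the following sketch evaluates Church's third axiom with F read as the truth value False. It is an illustration only: the names `FALSE`, `imp`, `church_neg`, and `axiom_c3` are hypothetical, and a truth-table evaluation does not replace the derivations requested in the exercises.

```python
FALSE = False  # assumed reading of the constant F


def imp(p: bool, q: bool) -> bool:
    """Usual truth table of the implication (p) => (q)."""
    return (not p) or q


def church_neg(p: bool) -> bool:
    """The formula (P) => (F), which behaves like a negation of P."""
    return imp(p, FALSE)


def axiom_c3(p: bool) -> bool:
    """Axiom C3: {[(P) => (F)] => (F)} => (P)."""
    return imp(imp(imp(p, FALSE), FALSE), p)


for p in (False, True):
    # Columns: P, (P) => (F), usual negation of P, value of axiom C3.
    print(p, church_neg(p), not p, axiom_c3(p))
```

Under this reading, the second and third columns of the output agree and the last column is always `True`, so axiom C3 corresponds to the law of double negation {¬[¬(P)]} ⇒ (P).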


1.41. This exercise establishes the converse law of contraposition in Church’s
system. Prove the tautology {[(B) ⇒ (F)] ⇒ [(A) ⇒ (F)]} ⇒ [(A) ⇒ (B)] within
Church’s system, using only results from the implicational calculus (axioms C1 and
C2) and axiom C3 (not axiom P3). Hint: start from axiom C2, with (B) ⇒ (F)
for P, with A for Q, and with F for R. Then use the transitivity of implications.


To show the equivalence of Church’s logical system and classical logic, the
following exercise establishes the equivalence of ¬(P) and Church’s (P) ⇒ (F).


1.42. Prove that [(P) ⇒ (F)] ⇒ [¬(P)] and [¬(P)] ⇒ [(P) ⇒ (F)] are theorems,
using the theorem ¬(F), and the classical axioms P1, P2, P3.


1.43. Define a “negation” ∼(P) to be an abbreviation for (P) ⇒ (F). Prove that
every theorem of classical logic is a theorem of Church’s logic.


1.44. Using the theorem ¬(F) and axioms P1, P2, P3, prove the theorems
{[(P) ⇒ (F)] ⇒ (F)} ⇒ {¬[¬(P)]} and {¬[¬(P)]} ⇒ {[(P) ⇒ (F)] ⇒ (F)}.


1.45. Using axioms P1, P2, P3 and any of the classical theorems already proved,
prove [¬(P)] ⇒ [(P) ⇒ {¬[(S) ⇒ (S)]}] and [(P) ⇒ {¬[(S) ⇒ (S)]}] ⇒ [¬(P)].


1.46. In classical logic define a constant F to be an abbreviation for ¬[(S) ⇒ (S)].
Prove that every theorem of Church’s logic is a theorem of classical logic.


1.5 Other Connectives 


There exist logical connectives other than the negation (¬) and implication (⇒),
for example, the conjunction (∧), disjunction (∨), and equivalence (⇔). Such other
connectives can be specified by other axioms, for instance, by Tarski’s axioms IV–VI
to specify the equivalence, as in example 1.87 on page 55. Alternatively, other
connectives can be defined as abbreviations of longer formulae in terms of negations
and implications, as presented in this section.


1.5.1 Definitions of Other Connectives


The logical connectives ¬ and ⇒ suffice to define all the other logical connectives,
for instance, the conjunction ∧, the disjunction ∨, and the equivalence ⇔, as
outlined here.


1.51 Definition (conjunction, disjunction, and equivalence). In the full Classical
Propositional Calculus, the connectives ∧, ∨, and ⇔ may be defined as follows:
(P) ∧ (Q)   stands for ¬{(P) ⇒ [¬(Q)]};
(P) ∨ (Q)   stands for [(P) ⇒ (Q)] ⇒ (Q);
(P) ⇔ (Q)   stands for [(P) ⇒ (Q)] ∧ [(Q) ⇒ (P)].
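
The abbreviations in definition 1.51 can be compared with the familiar truth tables of "and", "or", and "if and only if". The sketch below is a semantic illustration only, with hypothetical helper names (`neg`, `imp`, `conj`, `disj`, `equiv`, `agrees_with_usual_tables`); it assumes the usual truth tables for ¬ and ⇒ and is not part of the axiomatic development.

```python
from itertools import product


def neg(p: bool) -> bool:
    return not p


def imp(p: bool, q: bool) -> bool:
    return (not p) or q


def conj(p: bool, q: bool) -> bool:
    """(P) AND (Q) stands for NOT{(P) => [NOT(Q)]}."""
    return neg(imp(p, neg(q)))


def disj(p: bool, q: bool) -> bool:
    """(P) OR (Q) stands for [(P) => (Q)] => (Q)."""
    return imp(imp(p, q), q)


def equiv(p: bool, q: bool) -> bool:
    """(P) <=> (Q) stands for [(P) => (Q)] AND [(Q) => (P)]."""
    return conj(imp(p, q), imp(q, p))


def agrees_with_usual_tables() -> bool:
    """Compare the defined connectives with Python's and, or, ==."""
    return all(
        conj(p, q) == (p and q)
        and disj(p, q) == (p or q)
        and equiv(p, q) == (p == q)
        for p, q in product([False, True], repeat=2)
    )


print(agrees_with_usual_tables())  # True
```

The agreement holds in classical two-valued semantics. In logics that introduce ∧ and ∨ by separate axioms, these connectives need not be definable from ¬ and ⇒ in this way.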


1.5.2 Examples of Proofs of Theorems with Conjunctions 


The logical conjunction (∧) can also be introduced into a logic by additional
axioms, for instance, as in Hilbert’s Positive Propositional Calculus, Brouwer &
Heyting’s Intuitionistic Logic and Kolmogorov & Johansson’s Minimal Logic [18,
§26, p. 140-146]:

(AND.1) ⊢ [(P) ∧ (Q)] ⇒ (Q)
(AND.2) ⊢ [(P) ∧ (Q)] ⇒ (P)
(AND.3) ⊢ (P) ⇒ {(Q) ⇒ [(P) ∧ (Q)]}

The following theorems reveal that within the Classical Propositional Calculus,
such axioms also follow from definition 1.51.
The first theorems show that if (P) ∧ (Q) holds, then P holds and Q holds.


1.52 Theorem (AND.1). The formula [(P) ∧ (Q)] ⇒ (Q) is a theorem (schema).


Proof. Apply transposition, double negation, transitivity, and the definition:


⊢ [¬(Q)] ⇒ {(P) ⇒ [¬(Q)]}              axiom P1,
⊢ (¬{(P) ⇒ [¬(Q)]}) ⇒ {¬[¬(Q)]}        contraposition (1.44) of P1, and MP,
⊢ [(P) ∧ (Q)] ⇒ {¬[¬(Q)]}              definition 1.51 of ∧,
⊢ {¬[¬(Q)]} ⇒ (Q)                      theorem 1.42,
⊢ [(P) ∧ (Q)] ⇒ (Q)                    theorem 1.16.
□

1.53 Theorem (AND.2). The formula [(P) ∧ (Q)] ⇒ (P) is a theorem.

Proof. Apply theorems 1.40, 1.44, 1.42, and 1.16:

⊢ [¬(P)] ⇒ {(P) ⇒ [¬(Q)]}              denial of the antecedent (1.40) with ¬(Q) for Q,
⊢ (¬{(P) ⇒ [¬(Q)]}) ⇒ {¬[¬(P)]}        contraposition (1.44) of 1.40,
⊢ {¬[¬(P)]} ⇒ (P)                      double negation (1.42),
⊢ [(P) ∧ (Q)] ⇒ (P)                    definition 1.51 of ∧ and transitivity (theorem 1.16).
□


The next theorem demonstrates a “converse” to the preceding two theorems, so
that if P and Q hold, then their conjunction (P) ∧ (Q) also holds.


1.54 Theorem (AND.3). If P and Q are both theorems, then (P) ∧ (Q) is a theorem;
equivalently, for all well-formed formulae P and Q, ⊢ (P) ⇒ {(Q) ⇒ [(P) ∧ (Q)]}.


Proof. Apply the law of assertion, contraposition, and transitivity:

⊢ (P) ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(Q)])          assertion (theorem 1.38),
⊢ ({(P) ⇒ [¬(Q)]} ⇒ [¬(Q)])
    ⇒ {(Q) ⇒ (¬{(P) ⇒ [¬(Q)]})}            contraposition,
⊢ (P) ⇒ {(Q) ⇒ (¬{(P) ⇒ [¬(Q)]})}          theorem 1.16,
⊢ (P) ⇒ {(Q) ⇒ [(P) ∧ (Q)]}                definition of ∧.

Hence, if P and Q are both theorems, then (P) ∧ (Q) is a theorem:

⊢ (P) ⇒ {(Q) ⇒ [(P) ∧ (Q)]}    just proved,
⊢ (P)                          hypothesis,
⊢ (Q) ⇒ [(P) ∧ (Q)]            Detachment,
⊢ (Q)                          hypothesis,
⊢ (P) ∧ (Q)                    Detachment.


Theorem 1.54 allows for the following derived rule of inference. 


1.55 Theorem (derived rule). For all well-formed formulae H, K, and L, if (H) ⇒
(K) and (H) ⇒ (L) are theorems, then (H) ⇒ [(K) ∧ (L)] is a theorem:


⊢ [(H) ⇒ (K)] ⇒ ([(H) ⇒ (L)] ⇒ {(H) ⇒ [(K) ∧ (L)]});


conversely, if (H) ⇒ [(K) ∧ (L)] is a theorem, then (H) ⇒ (K) and (H) ⇒ (L) are
theorems.

Proof. Apply theorem 1.54 and transitivity (theorem 1.32) with M for (K) ∧ (L):

⊢ [(H) ⇒ {(K) ⇒ [(L) ⇒ (M)]}]
    ⇒ ([(H) ⇒ (K)] ⇒ {(H) ⇒ [(L) ⇒ (M)]})             axiom P2,
⊢ {(H) ⇒ [(L) ⇒ (M)]}
    ⇒ {[(H) ⇒ (L)] ⇒ [(H) ⇒ (M)]}                     axiom P2,
⊢ [(H) ⇒ {(K) ⇒ [(L) ⇒ (M)]}]
    ⇒ ([(H) ⇒ (K)] ⇒ {[(H) ⇒ (L)] ⇒ [(H) ⇒ (M)]})     theorem 1.32,
⊢ (K) ⇒ [(L) ⇒ (M)]                                   theorem 1.54,
⊢ (H) ⇒ {(K) ⇒ [(L) ⇒ (M)]}                           theorem 1.12,
⊢ [(H) ⇒ (K)] ⇒ {[(H) ⇒ (L)] ⇒ [(H) ⇒ (M)]}           Detachment,

with M for (K) ∧ (L). The converse results from theorems 1.52 and 1.12:

⊢ [(K) ∧ (L)] ⇒ (L)                                   theorem 1.52,
⊢ (H) ⇒ {[(K) ∧ (L)] ⇒ (L)}                           theorem 1.12,
⊢ {(H) ⇒ [(Q) ⇒ (R)]} ⇒ {[(H) ⇒ (Q)] ⇒ [(H) ⇒ (R)]}   axiom P2,
⊢ {(H) ⇒ [(K) ∧ (L)]} ⇒ [(H) ⇒ (L)]                   Detachment.


Replacing [(K) ∧ (L)] ⇒ (L) (theorem 1.52) by [(K) ∧ (L)] ⇒ (K) (theorem 1.53)
in the foregoing proof yields a proof of {(H) ⇒ [(K) ∧ (L)]} ⇒ [(H) ⇒ (K)]. □


For instance, theorem 1.55 yields the following theorem. 


1.56 Theorem (idempotency of ∧). For each well-formed formula P, the formulae
(P) ⇒ [(P) ∧ (P)] and [(P) ∧ (P)] ⇒ (P) are theorems.


Proof. Substitute P for each of H, K, L in theorem 1.55:

⊢ [(H) ⇒ (K)] ⇒ ([(H) ⇒ (L)] ⇒ {(H) ⇒ [(K) ∧ (L)]})   theorem 1.55,
⊢ [(P) ⇒ (P)] ⇒ ([(P) ⇒ (P)] ⇒ {(P) ⇒ [(P) ∧ (P)]})   substitutions,
⊢ (P) ⇒ (P)                                           theorem 1.14,
⊢ [(P) ⇒ (P)] ⇒ {(P) ⇒ [(P) ∧ (P)]}                   Detachment,
⊢ (P) ⇒ [(P) ∧ (P)]                                   Detachment.

The converse implication is a substitution in theorem 1.53. □


The next theorem shows that the conjunction ∧ commutes.


1.57 Theorem (commutativity of ∧). For all well-formed formulae P and Q, the
formula [(P) ∧ (Q)] ⇒ [(Q) ∧ (P)] is a theorem.


Proof. Apply theorems 1.52, 1.53, and 1.55:

⊢ [(P) ∧ (Q)] ⇒ (Q)                theorem 1.52,
⊢ [(P) ∧ (Q)] ⇒ (P)                theorem 1.53,
⊢ [(P) ∧ (Q)] ⇒ [(Q) ∧ (P)]        theorem 1.55.

Swapping the roles of P and Q yields the converse: [(Q) ∧ (P)] ⇒ [(P) ∧ (Q)].
□


1.58 Theorem (derived rule). If ⊢ [(H) ∧ (L)] ⇒ (N), then ⊢ (H) ⇒
[(L) ⇒ (N)].


Proof. Apply theorems 1.54 and 1.32.

⊢ (H) ⇒ {(L) ⇒ [(H) ∧ (L)]}    theorem 1.54,
⊢ [(H) ∧ (L)] ⇒ (N)            hypothesis,
⊢ (H) ⇒ [(L) ⇒ (N)]            theorem 1.32.
□

1.59 Theorem (derived rule). If ⊢ (H) ⇒ [(L) ⇒ (N)], then ⊢ [(H) ∧ (L)]
⇒ (N).

Proof. Apply theorems 1.53, 1.16, 1.52, 1.31, and 1.25.

1  ⊢ [(H) ∧ (L)] ⇒ (H)                      theorem 1.53,
2  ⊢ (H) ⇒ [(L) ⇒ (N)]                      hypothesis,
3  ⊢ [(H) ∧ (L)] ⇒ [(L) ⇒ (N)]              lines 1, 2, theorem 1.16;
4  ⊢ [(H) ∧ (L)] ⇒ (L)                      theorem 1.52,
5  ⊢ [(H) ∧ (L)] ⇒ {[(H) ∧ (L)] ⇒ (N)}      lines 3, 4, theorem 1.31,
6  ⊢ [(H) ∧ (L)] ⇒ (N)                      theorem 1.25.
□


1.5.3 Examples of Proofs of Theorems with Equivalences 


The logical equivalence (⇔) can also be introduced into a logic by additional
axioms, for instance, in Hilbert’s Positive Propositional Calculus, Brouwer &
Heyting’s Intuitionistic Logic and Kolmogorov & Johansson’s Minimal Logic [18, §26,
p. 140-146], by Tarski’s axioms IV, V, and VI [129, p. 147]:


(EQ.1) Tarski’s Axiom IV: ⊢ [(P) ⇔ (Q)] ⇒ [(P) ⇒ (Q)]
(EQ.2) Tarski’s Axiom V:  ⊢ [(P) ⇔ (Q)] ⇒ [(Q) ⇒ (P)]
(EQ.3) Tarski’s Axiom VI: ⊢ [(P) ⇒ (Q)] ⇒ {[(Q) ⇒ (P)] ⇒ [(P) ⇔ (Q)]}


The following theorems reveal that within the Classical Propositional Calculus,
such axioms also follow from definition 1.51.
Combining theorem 1.54 with the definition of ⇔ gives Tarski’s Axiom VI.


1.60 Theorem (Tarski’s axiom VI). For all well-formed formulae U and V,
Tarski’s axiom VI is a theorem: ⊢ [(U) ⇒ (V)] ⇒ {[(V) ⇒ (U)] ⇒ [(U) ⇔ (V)]}.


Proof. Combine theorem 1.54 with the definition of ⇔. With (U) ⇒ (V) for P
and (V) ⇒ (U) for Q, theorem 1.54 gives

⊢ [(U) ⇒ (V)] ⇒ ([(V) ⇒ (U)] ⇒ {[(U) ⇒ (V)] ∧ [(V) ⇒ (U)]}),

where [(U) ⇒ (V)] ∧ [(V) ⇒ (U)] is (U) ⇔ (V) by definition 1.51. □


A particular instance of the foregoing theorem allows for any proof of any
equivalence (I) ⇔ (J) to be split into two separate proofs of (I) ⇒ (J) and
(J) ⇒ (I).


1.61 Theorem (derived rule). If (I) ⇒ (J) and (J) ⇒ (I) are theorems, then so
is (I) ⇔ (J). Conversely, if (I) ⇔ (J) is a theorem, then so are (I) ⇒ (J) and
(J) ⇒ (I).


Proof. Apply theorem 1.54 with (I) ⇒ (J) for P, and with (J) ⇒ (I) for Q, so that
if (I) ⇒ (J) and (J) ⇒ (I) are theorems, then [(I) ⇒ (J)] ∧ [(J) ⇒ (I)] is also a
theorem, which is (I) ⇔ (J) by definition. The converse follows from theorems 1.52
and 1.53 with Detachment. □


Theorem 1.62 discharges the hypothesis (I) ⇔ (J) in theorem 1.61, which yields
Tarski’s axioms IV and V.


1.62 Theorem (Tarski’s axioms IV and V). For all well-formed formulae I and
J, Tarski’s axiom IV, [(I) ⇔ (J)] ⇒ [(I) ⇒ (J)], and Tarski’s axiom V,
[(I) ⇔ (J)] ⇒ [(J) ⇒ (I)], are theorems.


Proof. By definition 1.51, the formula (I) ⇔ (J) is an abbreviation for [(I) ⇒
(J)] ∧ [(J) ⇒ (I)]. The present theorem results from substitutions in theorems 1.52
and 1.53. □


Hence the following theorem establishes the reflexivity of equivalence.

1.63 Theorem (reflexivity of ⇔). For each well-formed formula P, ⊢ (P) ⇔ (P).


Proof. Apply the definition of ⇔ and the reflexivity of ⇒ (theorem 1.14) with
theorem 1.61:

⊢ (P) ⇒ (P)    theorem 1.14,
⊢ (P) ⇒ (P)    theorem 1.14,
⊢ (P) ⇔ (P)    theorem 1.61.

The following theorem establishes the symmetry of equivalence.


1.64 Theorem (symmetry of ⇔). For all well-formed formulae H and K, if
(H) ⇔ (K) is a theorem, then so is (K) ⇔ (H).


Proof. Apply the definition of ⇔ and the commutativity of ∧ (theorem 1.57):

⊢ (H) ⇔ (K)                      hypothesis,
⊢ [(H) ⇒ (K)] ∧ [(K) ⇒ (H)]      definition of ⇔,
⊢ [(K) ⇒ (H)] ∧ [(H) ⇒ (K)]      commutativity of ∧ and Detachment,
⊢ (K) ⇔ (H)                      definition of ⇔.


Similarly, the following theorem establishes the transitivity of equivalence.


1.65 Theorem (transitivity of ⇔). For all well-formed formulae H, K, and L, if
(H) ⇔ (K) and (K) ⇔ (L) are theorems, then (H) ⇔ (L) is a theorem.


Proof. Apply the definition of ⇔ and theorems 1.61 and 1.16:

⊢ (H) ⇔ (K)    hypothesis,
⊢ (H) ⇒ (K)    theorem 1.61,
⊢ (K) ⇔ (L)    hypothesis,
⊢ (K) ⇒ (L)    theorem 1.61,
⊢ (H) ⇒ (L)    theorem 1.16.

By symmetry (theorem 1.64), (K) ⇔ (H) and (L) ⇔ (K) are also theorems,
whence (L) ⇒ (H) is a theorem. From (H) ⇒ (L) and (L) ⇒ (H) it then follows
that (H) ⇔ (L) is a theorem. □


The following theorem demonstrates the use of the transitivity of equivalence in
the proof that the conjunction ∧ is associative.


1.66 Theorem (associativity of ∧). For all well-formed formulae P, Q, and R,


⊢ {(P) ∧ [(Q) ∧ (R)]} ⇔ {[(P) ∧ (Q)] ∧ (R)}.


Proof. Apply the law of commutation (theorem 1.37) and theorem 1.55:

⊢ [(P) ⇒ {(R) ⇒ [¬(Q)]}] ⇔ [(R) ⇒ {(P) ⇒ [¬(Q)]}]          twice theorem 1.37 and theorem 1.55,
    ⇕ contrapositions,
⊢ {¬[(P) ⇒ {(R) ⇒ [¬(Q)]}]} ⇔ {¬[(R) ⇒ {(P) ⇒ [¬(Q)]}]}
    ⇕ definition of ∧,
⊢ {¬[(P) ⇒ {¬[(R) ∧ (Q)]}]} ⇔ {¬[(R) ⇒ {¬[(P) ∧ (Q)]}]}
    ⇕ definition of ∧,
⊢ {(P) ∧ [(R) ∧ (Q)]} ⇔ {(R) ∧ [(P) ∧ (Q)]}
    ⇕ commutativity of ∧,
⊢ {(P) ∧ [(Q) ∧ (R)]} ⇔ {[(P) ∧ (Q)] ∧ (R)}


1.5.4 Examples of Proofs of Theorems with Disjunctions 


The logical disjunction (∨) can also be introduced into a logic by additional
axioms, for instance, as in Hilbert’s Positive Propositional Calculus, Brouwer &
Heyting’s Intuitionistic Logic and Kolmogorov & Johansson’s Minimal Logic [18,
§26, p. 140-146]:


(OR.1) ⊢ (P) ⇒ [(P) ∨ (Q)]
(OR.2) ⊢ (Q) ⇒ [(P) ∨ (Q)]
(OR.3) ⊢ [(P) ⇒ (R)] ⇒ ([(Q) ⇒ (R)] ⇒ {[(P) ∨ (Q)] ⇒ (R)})


The following theorems reveal that within the Classical Propositional Calculus,
the first two axioms also follow from definition 1.51. The third axiom is derived in
theorem 1.78.


1.67 Theorem (OR.1, OR.2). For all well-formed formulae P and Q, the formulae
(Q) ⇒ [(P) ∨ (Q)] and (P) ⇒ [(P) ∨ (Q)] are theorems.


Proof. For the first formula, apply axiom P1 and the definition of ∨:

⊢ (Q) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}    axiom P1,
⊢ (Q) ⇒ [(P) ∨ (Q)]            definition of ∨.

For the second formula, apply theorem 1.14, the Law of Commutation, and the
definitions:

⊢ [(P) ⇒ (Q)] ⇒ [(P) ⇒ (Q)]    theorem 1.14,
⊢ (P) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}    Law of Commutation (theorem 1.37),
⊢ (P) ⇒ [(P) ∨ (Q)]            definition 1.51 of ∨.


The next theorem shows that the disjunction ∨ is idempotent:


1.68 Theorem (idempotency of ∨). For each well-formed formula Q,


⊢ (Q) ⇔ [(Q) ∨ (Q)].


Proof. Substituting Q for P in theorem 1.67 yields ⊢ (Q) ⇒ [(Q) ∨ (Q)].
For the converse, apply theorem 1.14, the Law of Assertion (theorem 1.38), and
Detachment:


with H for (Q) ⇒ (Q) and C for Q,

⊢ (Q) ⇒ (Q)                                     theorem 1.14, (H),
⊢ [(Q) ⇒ (Q)] ⇒ ({[(Q) ⇒ (Q)] ⇒ (Q)} ⇒ (Q))     theorem 1.38, (H) ⇒ {[(H) ⇒ (C)] ⇒ (C)},
⊢ {[(Q) ⇒ (Q)] ⇒ (Q)} ⇒ (Q)                     Detachment, [(H) ⇒ (C)] ⇒ (C),
⊢ [(Q) ∨ (Q)] ⇒ (Q)                             definition of ∨.

Hence the equivalence ⊢ (Q) ⇔ [(Q) ∨ (Q)] results from theorem 1.61. □


Theorem 1.69 provides an alternative definition of the disjunction in the Classical 
Propositional Calculus. 


1.69 Theorem. For all well-formed formulae P and Q, the following formulae are
theorems:


⊢ {[¬(P)] ⇒ (Q)} ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}.
⊢ {[(P) ⇒ (Q)] ⇒ (Q)} ⇒ {[¬(P)] ⇒ (Q)}.


Proof. The first formula is a substitution in the proof by cases (theorem 1.49).
The second formula results from the law of denial of the antecedent
(theorem 1.40) and a derived rule (theorem 1.33), with I for ¬(P), A for
(P) ⇒ (Q), and C for Q:


⊢ [¬(P)] ⇒ [(P) ⇒ (Q)]                    theorem 1.40, (I) ⇒ (A),
⊢ {[(P) ⇒ (Q)] ⇒ (Q)} ⇒ {[¬(P)] ⇒ (Q)}    theorem 1.33, [(A) ⇒ (C)] ⇒ [(I) ⇒ (C)].


The next theorem establishes the law of excluded middle in classical logic.


1.70 Theorem (law of excluded middle). For each well-formed formula B, the
formula (B) ∨ [¬(B)] is a theorem.


Proof. Apply theorems 1.14 and 1.69, and the definition of the disjunction (∨):

⊢ [¬(B)] ⇒ [¬(B)]    theorem 1.14,
⊢ (B) ∨ [¬(B)]       theorem 1.69 and definition of ∨ with ¬ and ⇒.


The following theorem shows that the disjunction ∨ is associative:


1.71 Theorem (associativity of ∨). For all well-formed formulae P, Q, and R,


⊢ {(P) ∨ [(Q) ∨ (R)]} ⇔ {[(P) ∨ (Q)] ∨ (R)}.

Proof. Apply the law of commutation (theorem 1.37) and theorem 1.55:

⊢ ([¬(P)] ⇒ {[¬(R)] ⇒ (Q)}) ⇔ ([¬(R)] ⇒ {[¬(P)] ⇒ (Q)})    twice theorem 1.37 and theorem 1.55,
    ⇕ definition of ∨,
⊢ {(P) ∨ [(R) ∨ (Q)]} ⇔ {(R) ∨ [(P) ∨ (Q)]}
    ⇕ commutativity of ∨, etc.
⊢ {(P) ∨ [(Q) ∨ (R)]} ⇔ {[(P) ∨ (Q)] ∨ (R)}


1.5.5 Examples of Proofs with Conjunctions and Disjunctions 


The following theorems establish De Morgan’s laws in classical logic. 


1.72 Theorem (De Morgan’s first law). For all well-formed formulae P and Q,
⊢ {¬[(P) ∧ (Q)]} ⇔ {[¬(P)] ∨ [¬(Q)]}.


Proof. Apply the definitions of ∧ and ∨ and double negations:

¬[(P) ∧ (Q)]
    ⇕ definition of ∧,
¬(¬{(P) ⇒ [¬(Q)]})
    ⇕ double negation (theorems 1.42 and 1.43),
(P) ⇒ [¬(Q)]
    ⇕ double negation (theorems 1.42 and 1.43),
{¬[¬(P)]} ⇒ [¬(Q)]
    ⇕ definition of ∨ by theorem 1.69.
[¬(P)] ∨ [¬(Q)].
□


1.73 Theorem (De Morgan’s second law). For all well-formed formulae P and Q,
⊢ {¬[(P) ∨ (Q)]} ⇔ {[¬(P)] ∧ [¬(Q)]}.


Proof. Apply the definitions of ∧ and ∨ and double negations:

¬[(P) ∨ (Q)]
    ⇕ definition of ∨ by theorem 1.69,
¬{[¬(P)] ⇒ (Q)}
    ⇕ double negation (theorems 1.42 and 1.43),
¬([¬(P)] ⇒ {¬[¬(Q)]})
    ⇕ definition of ∧.
[¬(P)] ∧ [¬(Q)].


The following theorem shows that disjunctions distribute over conjunctions. 




1.74 Theorem (distributivity of ∨ over ∧). For all well-formed formulae P, Q,
and R,


⊢ {(P) ∨ [(Q) ∧ (R)]} ⇔ {[(P) ∨ (Q)] ∧ [(P) ∨ (R)]}.


Proof. Apply the definition of ∨ with De Morgan’s laws and theorem 1.55:

[(P) ∨ (Q)] ∧ [(P) ∨ (R)]
    ⇕ definition of ∨ by theorem 1.69,
{[¬(P)] ⇒ (Q)} ∧ {[¬(P)] ⇒ (R)}
    ⇕ theorem 1.55,
[¬(P)] ⇒ [(Q) ∧ (R)]
    ⇕ definition of ∨ by theorem 1.69.
(P) ∨ [(Q) ∧ (R)]


The following theorem shows that conjunctions distribute over disjunctions.


1.75 Theorem (distributivity of ∧ over ∨). The following formula is a theorem:
{(P) ∧ [(Q) ∨ (R)]} ⇔ {[(P) ∧ (Q)] ∨ [(P) ∧ (R)]}.

Proof. Apply the definition of ∧ and theorem 1.55:

[(P) ∧ (Q)] ∨ [(P) ∧ (R)]
    ⇕ definition of ∧,
(¬{(P) ⇒ [¬(Q)]}) ∨ (¬{(P) ⇒ [¬(R)]})
    ⇕ De Morgan’s first law,
¬({(P) ⇒ [¬(Q)]} ∧ {(P) ⇒ [¬(R)]})
    ⇕ theorem 1.55,
¬[(P) ⇒ {[¬(Q)] ∧ [¬(R)]}]
    ⇕ De Morgan’s second law,
¬[(P) ⇒ {¬[(Q) ∨ (R)]}]
    ⇕ definition of ∧.
(P) ∧ [(Q) ∨ (R)]


1.5.6 Exercises on Other Connectives 


For the following exercises, prove that the stated formulae are theorems (schemas), 
using the classical propositional calculus and any of the results just proved. 


1.47. {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)
1.48. [(P) ⇒ (R)] ⇒ [{[(P) ⇒ (Q)] ⇒ (R)} ⇒ (R)]




1.49. {[(P) ⇒ (Q)] ⇒ (R)} ⇒ {[(R) ⇒ (P)] ⇒ (P)}
1.50. {[(P) ⇒ (Q)] ⇒ (R)} ⇒ {[(P) ⇒ (R)] ⇒ (R)}
1.51. (P) ⇒ ([¬(Q)] ⇒ {¬[(P) ⇒ (Q)]})

1.52. [¬(P)] ⇒ {¬[(P) ∧ (Q)]}

1.53. (P) ⇒ {[(P) ⇒ (Q)] ⇒ (P)}

1.54. [¬(P)] ⇒ ([¬(Q)] ⇒ {¬[(P) ∨ (Q)]})

1.55. [(P) ⇒ (Q)] ⇒ (¬{(P) ∧ [¬(Q)]})

1.56. [(P) ⇒ (Q)] ⇒ {[¬(P)] ∨ (Q)}


1.6 Patterns of Deduction with Other Connectives 


The preceding sections have demonstrated the theoretical foundations for patterns of 
deduction in the implicational calculus, for example, the transitivity of implication 
and the law of commutation, and with contraposition, for example, the law of 
reductio ad absurdum. Similarly, this section presents patterns of deduction with 
conjunctions and disjunctions, for instance, proofs by cases or by contradiction. 


1.6.1 Conjunctions of Implications


The first theorem shows yet another form of the transitivity of the logical implication.


1.76 Theorem. For all well-formed formulae P, Q, and R,


⊢ {[(P) ⇒ (Q)] ∧ [(Q) ⇒ (R)]} ⇒ [(P) ⇒ (R)].


Proof. Apply theorems 1.52, 1.53, 1.27, and 1.21, with H for [(P) ⇒ (Q)] ∧
[(Q) ⇒ (R)], K for (Q) ⇒ (R), L for (P) ⇒ (Q), and M for (P) ⇒ (R):

⊢ {[(P) ⇒ (Q)] ∧ [(Q) ⇒ (R)]} ⇒ [(Q) ⇒ (R)]     theorem 1.52, (H) ⇒ (K),
⊢ {[(P) ⇒ (Q)] ∧ [(Q) ⇒ (R)]} ⇒ [(P) ⇒ (Q)]     theorem 1.53, (H) ⇒ (L),
⊢ [(Q) ⇒ (R)] ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}     theorem 1.27, (K) ⇒ [(L) ⇒ (M)],
⊢ {[(P) ⇒ (Q)] ∧ [(Q) ⇒ (R)]} ⇒ [(P) ⇒ (R)]     theorem 1.21, (H) ⇒ (M).
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The following theorems give derived rules of inference with conjunctions of 
implications. The first theorem shows that if either of two hypotheses leads to 
a conclusion, then the disjunction of both hypotheses also leads to the same 
conclusion. 


1.77 Theorem. If ⊢ (U) ⇒ (W) and ⊢ (V) ⇒ (W), then ⊢ [(U) ∨ (V)] ⇒ (W):

⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[(U) ∨ (V)] ⇒ (W)}.


Proof. The first part of the proof assumes the two hypotheses.


⊢ (U) ⇒ (W)                         hypothesis,
⊢ [¬(W)] ⇒ [¬(U)]                   contraposition and Detachment,
⊢ (V) ⇒ (W)                         hypothesis,
⊢ [¬(W)] ⇒ [¬(V)]                   contraposition and Detachment,
⊢ [¬(W)] ⇒ {[¬(U)] ∧ [¬(V)]}        theorem 1.55,
⊢ [¬(W)] ⇒ {¬[(U) ∨ (V)]}           De Morgan’s second law,
⊢ [(U) ∨ (V)] ⇒ (W)                 axiom P3 and Detachment.

The second part of the proof dispenses with the two hypotheses.

⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ [(U) ⇒ (W)]                  theorem 1.53,
⊢ [(U) ⇒ (W)] ⇒ {[¬(W)] ⇒ [¬(U)]}                            transposition,
⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[¬(W)] ⇒ [¬(U)]}            theorem 1.16;
⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ [(V) ⇒ (W)]                  theorem 1.53,
⊢ [(V) ⇒ (W)] ⇒ {[¬(W)] ⇒ [¬(V)]}                            transposition,
⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[¬(W)] ⇒ [¬(V)]}            theorem 1.16;

⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]}
    ⇒ ({[¬(W)] ⇒ [¬(U)]} ∧ {[¬(W)] ⇒ [¬(V)]})                theorem 1.55;

⊢ ({[¬(W)] ⇒ [¬(U)]} ∧ {[¬(W)] ⇒ [¬(V)]})
    ⇒ ([¬(W)] ⇒ {[¬(U)] ∧ [¬(V)]})                           theorem 1.55;

⊢ ([¬(W)] ⇒ {[¬(U)] ∧ [¬(V)]}) ⇒ {[(U) ∨ (V)] ⇒ (W)}         contraposition;
⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[(U) ∨ (V)] ⇒ (W)}          theorem 1.55.
□


Theorem 1.78 shows that an intuitionistic and minimalist axiom for the disjunc- 
tion is a theorem in the Classical Propositional Calculus. 


1.78 Theorem (OR.3). For all propositional forms P, Q, and R,


⊢ [(P) ⇒ (R)] ⇒ ([(Q) ⇒ (R)] ⇒ {[(P) ∨ (Q)] ⇒ (R)}).


Proof. Apply theorems 1.58 and 1.77, with H standing for (P) ⇒ (R), L for
(Q) ⇒ (R), and N for [(P) ∨ (Q)] ⇒ (R):

⊢ {[(P) ⇒ (R)] ∧ [(Q) ⇒ (R)]} ⇒ {[(P) ∨ (Q)] ⇒ (R)}        theorem 1.77   [{(H) ∧ (L)} ⇒ (N)],
⊢ [(P) ⇒ (R)] ⇒ ([(Q) ⇒ (R)] ⇒ {[(P) ∨ (Q)] ⇒ (R)})        theorem 1.58   [(H) ⇒ {(L) ⇒ (N)}].
□


The next theorem shows that the disjunction ∨ commutes:


1.79 Theorem (commutativity of ∨). For all well-formed formulae P and Q, the
formula [(P) ∨ (Q)] ⇔ [(Q) ∨ (P)] is a theorem.


Proof. Substitute (Q) ∨ (P) for R in theorem 1.78 and apply theorem 1.67 with
Detachment:


⊢ {(P) ⇒ [(Q) ∨ (P)]} ⇒ ({(Q) ⇒ [(Q) ∨ (P)]}
    ⇒ {[(P) ∨ (Q)] ⇒ [(Q) ∨ (P)]})                           theorem 1.78,
⊢ (P) ⇒ [(Q) ∨ (P)]                                          theorem 1.67,
⊢ {(Q) ⇒ [(Q) ∨ (P)]} ⇒ {[(P) ∨ (Q)] ⇒ [(Q) ∨ (P)]}          Detachment,
⊢ (Q) ⇒ [(Q) ∨ (P)]                                          theorem 1.67,
⊢ [(P) ∨ (Q)] ⇒ [(Q) ∨ (P)]                                  Detachment.

The reverse implication results from swapping the rôles of P and Q. □


Similarly, the second theorem shows that if either of two hypotheses leads to 
a conclusion, then the conjunction of both hypotheses also leads to the same 
conclusion. 


1.80 Theorem. If ⊢ (U) ⇒ (W) and ⊢ (V) ⇒ (W), then ⊢ [(U) ∧ (V)] ⇒ (W):

⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[(U) ∧ (V)] ⇒ (W)}.


Proof. This proof relies on theorems 1.77, 1.53, 1.52, 1.67, and 1.31, with H
standing for [(U) ⇒ (W)] ∧ [(V) ⇒ (W)], K for (U) ∧ (V), L for (U) ∨ (V), and
M for W:

⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[(U) ∨ (V)] ⇒ (W)}        theorem 1.77   [(H) ⇒ {(L) ⇒ (M)}],
⊢ [(U) ∧ (V)] ⇒ [(U) ∨ (V)]                                theorems 1.53, 1.52, 1.67   [(K) ⇒ (L)],
⊢ {[(U) ⇒ (W)] ∧ [(V) ⇒ (W)]} ⇒ {[(U) ∧ (V)] ⇒ (W)}        theorem 1.31   [(H) ⇒ {(K) ⇒ (M)}].
□


The third theorem shows that if one hypothesis leads to either of two conclusions, 
then the same hypothesis also leads to the disjunction of both conclusions. 


1.81 Theorem. For all well-formed formulae P, Q, and S,


⊢ {[(P) ⇒ (Q)] ∨ [(P) ⇒ (S)]} ⇔ {(P) ⇒ [(Q) ∨ (S)]}.


Proof. This proof uses contraposition and previously established equivalences.


¬{(P) ⇒ [(Q) ∨ (S)]}
⇕ definition of ∧ and equivalences,
(P) ∧ {¬[(Q) ∨ (S)]}
⇕ De Morgan’s second law,
(P) ∧ {[¬(Q)] ∧ [¬(S)]}
⇕ idempotence of ∧,
[(P) ∧ (P)] ∧ {[¬(Q)] ∧ [¬(S)]}
⇕ associativity and commutativity of ∧,
{(P) ∧ [¬(Q)]} ∧ {(P) ∧ [¬(S)]}
⇕ definition of ∧,
{¬[(P) ⇒ (Q)]} ∧ {¬[(P) ⇒ (S)]}
⇕ De Morgan’s first law.
¬{[(P) ⇒ (Q)] ∨ [(P) ⇒ (S)]}
□


The fourth theorem shows that if two implications hold, then the conjunction of 
their hypotheses leads to the conjunction of their conclusions. 


1.82 Theorem. If ⊢ (P) ⇒ (Q) and ⊢ (R) ⇒ (S), then ⊢ [(P) ∧ (R)] ⇒ [(Q) ∧ (S)]:


⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}.


Proof. The proof relies repeatedly on theorem 1.53, with H standing for
[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)], K for [(P) ∧ (R)] ⇒ (Q), and L for
[(P) ∧ (R)] ⇒ (S):


⊢ [(P) ∧ (R)] ⇒ (P)                                              theorem 1.53,
⊢ {[(P) ∧ (R)] ⇒ (P)} ⇒ ([(P) ⇒ (Q)] ⇒ {[(P) ∧ (R)] ⇒ (Q)})      theorem 1.28,
⊢ [(P) ⇒ (Q)] ⇒ {[(P) ∧ (R)] ⇒ (Q)}                              Detachment;

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ [(P) ⇒ (Q)]                      theorem 1.53,
⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ (Q)}              theorem 1.15   [(H) ⇒ (K)];

⊢ [(P) ∧ (R)] ⇒ (R)                                              theorem 1.53,
⊢ {[(P) ∧ (R)] ⇒ (R)} ⇒ ([(R) ⇒ (S)] ⇒ {[(P) ∧ (R)] ⇒ (S)})      theorem 1.28,
⊢ [(R) ⇒ (S)] ⇒ {[(P) ∧ (R)] ⇒ (S)}                              Modus Ponens;

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ [(R) ⇒ (S)]                      theorem 1.53,
⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ (S)}              theorem 1.15   [(H) ⇒ (L)];

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]}
    ⇒ ({[(P) ∧ (R)] ⇒ (Q)} ∧ {[(P) ∧ (R)] ⇒ (S)})                theorem 1.55   [(H) ⇒ {(K) ∧ (L)}];

⊢ ({[(P) ∧ (R)] ⇒ (Q)} ∧ {[(P) ∧ (R)] ⇒ (S)})
    ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}                                theorem 1.55;

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]}
    ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}                                theorem 1.15.
□


Similarly, the fifth theorem shows that if two implications hold, then the 
conjunction of their hypotheses leads to the disjunction of their conclusions. 


1.83 Theorem. If ⊢ (P) ⇒ (Q) and ⊢ (R) ⇒ (S), then ⊢ [(P) ∧ (R)] ⇒ [(Q) ∨ (S)]:

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∨ (S)]}.


Proof. This proof relies on theorem 1.82, with H standing for
[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)], L for (P) ∧ (R), M for (Q) ∧ (S), and N for
(Q) ∨ (S):

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}      theorem 1.82   [(H) ⇒ {(L) ⇒ (M)}],
⊢ [(Q) ∧ (S)] ⇒ [(Q) ∨ (S)]                                      theorems 1.53, 1.52, 1.67   [(M) ⇒ (N)],
⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∨ (S)]}      theorem 1.32   [(H) ⇒ {(L) ⇒ (N)}].
□

1.84 Theorem. For all well-formed formulae P, Q, and R,


⊢ {[(P) ⇒ (Q)] ∧ (R)} ⇒ {(P) ⇒ [(Q) ∧ (R)]}.


Proof. Apply theorems 1.14, 1.82 and 1.56:

⊢ [(P) ⇒ (Q)] ⇒ [(P) ⇒ (Q)]                                  theorem 1.14,
⊢ (R) ⇒ [(P) ⇒ (R)]                                          axiom P1,
⊢ {[(P) ⇒ (Q)] ∧ (R)} ⇒ {[(P) ⇒ (Q)] ∧ [(P) ⇒ (R)]}          theorem 1.82,
⊢ {[(P) ⇒ (Q)] ∧ [(P) ⇒ (R)]} ⇒ {(P) ⇒ [(Q) ∧ (R)]}          theorems 1.82 and 1.56.
□


1.6.2 Proofs by Cases or by Contradiction


The following theorems establish further derived rules of inference. The first 
theorem forms a part of the basis for an algorithm — called the Completeness 
Theorem — to design proofs within the propositional calculus, as explained in 
section 1.9. 


1.85 Theorem (proof by cases). If (H) ⇒ (R) and [¬(H)] ⇒ (R) are theorems,
then R is a theorem. Hence the following formula is also a theorem (schema):


⊢ ([(H) ⇒ (R)] ∧ {[¬(H)] ⇒ (R)}) ⇒ (R).


Proof. Apply theorem 1.77 and the law of excluded middle (theorem 1.70):


⊢ (H) ⇒ (R)                         hypothesis,
⊢ [¬(H)] ⇒ (R)                      hypothesis,
⊢ {(H) ∨ [¬(H)]} ⇒ (R)              theorem 1.77,
⊢ (H) ∨ [¬(H)]                      theorem 1.70,
⊢ R                                 Detachment;

⊢ ([(H) ⇒ (R)] ∧ {[¬(H)] ⇒ (R)}) ⇒ [{(H) ∨ [¬(H)]} ⇒ (R)]        theorem 1.77,
⊢ (H) ∨ [¬(H)]                                                   theorem 1.70,
⊢ ([(H) ⇒ (R)] ∧ {[¬(H)] ⇒ (R)}) ⇒ (R)                           theorem 1.17.
□


Theorem 1.85 is the theoretical basis for proofs by cases, in the sense that if a
conclusion R is a necessary consequence of a case H, and if R is also a necessary
consequence of all the other cases, lumped together into ¬(H), then R is a theorem.

The second theorem establishes the classical principle of proof by contradiction. 


1.86 Theorem (proof by contradiction). If [¬(R)] ⇒ (S) and [¬(R)] ⇒ [¬(S)]
are theorems, then R is a theorem: for all well-formed formulae R and S,


⊢ ({[¬(R)] ⇒ (S)} ∧ {[¬(R)] ⇒ [¬(S)]}) ⇒ (R).


Proof. Apply theorem 1.82, De Morgan’s first law, and contraposition:

⊢ [¬(R)] ⇒ (S)                                                   hypothesis,
⊢ [¬(R)] ⇒ [¬(S)]                                                hypothesis,

⊢ ({[¬(R)] ⇒ (S)} ∧ {[¬(R)] ⇒ [¬(S)]})
    ⇒ ({[¬(R)] ∧ [¬(R)]} ⇒ {(S) ∧ [¬(S)]})                       theorem 1.82,

⊢ [¬(R)] ⇒ {[¬(R)] ∧ [¬(R)]}                                     theorem 1.56,

⊢ ({[¬(R)] ⇒ (S)} ∧ {[¬(R)] ⇒ [¬(S)]})
    ⇒ ([¬(R)] ⇒ {(S) ∧ [¬(S)]})                                  theorem 1.31,

⊢ ([¬(R)] ⇒ {(S) ∧ [¬(S)]}) ⇒ [(¬{(S) ∧ [¬(S)]}) ⇒ {¬[¬(R)]}]    contraposition,
⊢ ([¬(R)] ⇒ {(S) ∧ [¬(S)]}) ⇒ [{[¬(S)] ∨ (S)} ⇒ (R)]             De Morgan,

⊢ [¬(S)] ∨ (S)                                                   theorem 1.70,
⊢ ([¬(R)] ⇒ {(S) ∧ [¬(S)]}) ⇒ (R)                                theorem 1.17,
⊢ ({[¬(R)] ⇒ (S)} ∧ {[¬(R)] ⇒ [¬(S)]}) ⇒ (R)                     theorem 1.15.
□


Theorem 1.86 is the theoretical basis for proofs by contradiction: if a conclusion
S and its negation ¬(S) are both necessary consequences of the negation ¬(R) of a
statement R, then R is a theorem.
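
The derived rules of this section can be cross-checked semantically. The following Python sketch is not a derivation from the axioms: it only evaluates each schema on every row of its Boolean Truth table, anticipating the dyadic logic of section 1.8. The names imp, is_tautology, and DERIVED_RULES are illustrative choices, not notation from the text.

```python
from itertools import product

def imp(a, b):
    """Boolean implication: False only when a is True and b is False."""
    return (not a) or b

def is_tautology(formula, arity):
    """True if the formula evaluates to True on every row of its truth table."""
    return all(formula(*row) for row in product([True, False], repeat=arity))

# Theorem number -> (number of letters, schema as a Boolean function).
DERIVED_RULES = {
    "1.76": (3, lambda p, q, r: imp(imp(p, q) and imp(q, r), imp(p, r))),
    "1.77": (3, lambda u, v, w: imp(imp(u, w) and imp(v, w), imp(u or v, w))),
    "1.80": (3, lambda u, v, w: imp(imp(u, w) and imp(v, w), imp(u and v, w))),
    "1.81": (3, lambda p, q, s: (imp(p, q) or imp(p, s)) == imp(p, q or s)),
    "1.82": (4, lambda p, q, r, s: imp(imp(p, q) and imp(r, s), imp(p and r, q and s))),
    "1.83": (4, lambda p, q, r, s: imp(imp(p, q) and imp(r, s), imp(p and r, q or s))),
    "1.84": (3, lambda p, q, r: imp(imp(p, q) and r, imp(p, q and r))),
    "1.85": (2, lambda h, r: imp(imp(h, r) and imp(not h, r), r)),
    "1.86": (2, lambda r, s: imp(imp(not r, s) and imp(not r, not s), r)),
}

results = {name: is_tautology(f, n) for name, (n, f) in DERIVED_RULES.items()}
for name, ok in results.items():
    print(f"theorem {name}: {'tautology' if ok else 'NOT a tautology'}")
```

A schema that fails this check cannot be a theorem of a sound calculus, so the check is a quick way to catch a mistranscribed formula before attempting a proof.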


1.6.3 Exercises on Patterns of Deduction


1.57. Determine whether the following derived rule holds. “If (V) ⇒ (W) and
(R) ⇒ (S) are theorems, then [(V) ∨ (R)] ⇒ [(W) ∨ (S)] is also a theorem”:


{[(V) ⇒ (W)] ∧ [(R) ⇒ (S)]} ⇒ {[(V) ∨ (R)] ⇒ [(W) ∨ (S)]}.


1.58. Determine whether the following derived rule holds. “If (I) ⇒ (R) and
(I) ⇒ [¬(S)] are both theorems, then ¬[(R) ⇒ (S)] is also a theorem”:


({(I) ⇒ (R)} ∧ {(I) ⇒ [¬(S)]}) ⇒ {¬[(R) ⇒ (S)]}.


1.59. Determine whether the following derived rule holds. “If (U) ⇒ (W) or
(V) ⇒ (W) is a theorem, then [(U) ∨ (V)] ⇒ (W) is also a theorem”:


{[(U) ⇒ (W)] ∨ [(V) ⇒ (W)]} ⇒ {[(U) ∨ (V)] ⇒ (W)}.


1.60. Determine whether the following derived rule holds. “If (P) ⇒ (Q) or
(R) ⇒ (S) is a theorem, then [(P) ∨ (R)] ⇒ [(Q) ∨ (S)] is also a theorem”:


{[(P) ⇒ (Q)] ∨ [(R) ⇒ (S)]} ⇒ {[(P) ∨ (R)] ⇒ [(Q) ∨ (S)]}.


1.61. Determine whether the following derived rule holds. “If (P) ⇒ (S) and
[¬(Q)] ⇒ [¬(S)] are theorems, then (P) ⇒ (Q) is also a theorem”:


({(P) ⇒ (S)} ∧ {[¬(Q)] ⇒ [¬(S)]}) ⇒ [(P) ⇒ (Q)].


1.62. Prove that if T is a theorem, then (R) ⇒ [(T) ⇒ (R)] is a theorem.
1.63. Prove that if T is a theorem, then [(T) ⇒ (R)] ⇒ (R) is a theorem.
1.64. Prove that if ¬(F) is a theorem, then so is {[¬(R)] ⇒ (F)} ⇒ (R).
1.65. Prove that if ¬(F) is a theorem, then so is (R) ⇒ {[¬(R)] ⇒ (F)}.


1.66. Determine whether the following derived rule of inference holds. “If ¬(F)
and {(P) ∧ [¬(Q)]} ⇒ (F) are theorems, then (P) ⇒ (Q) is also a theorem.”


1.6.4 Equivalent Classical Axiom Systems 


The Pure Positive Implicational Propositional Calculus just presented rests on
the rules of Detachment and Substitution with axioms P1, P2, and P3. Other
equivalent selections of axioms exist, for instance, Tarski’s axioms I-VII listed in
example 1.87.


1.87 Example (Tarski’s Axioms). Alfred Tarski lists seven axioms [129, p. 147]:
Tarski’s Axiom I.    (P) ⇒ [(Q) ⇒ (P)].
Tarski’s Axiom II.   {(P) ⇒ [(P) ⇒ (Q)]} ⇒ [(P) ⇒ (Q)].
Tarski’s Axiom III.  [(P) ⇒ (Q)] ⇒ {[(Q) ⇒ (R)] ⇒ [(P) ⇒ (R)]}.
Tarski’s Axiom IV.   [(P) ⇔ (Q)] ⇒ [(P) ⇒ (Q)].
Tarski’s Axiom V.    [(P) ⇔ (Q)] ⇒ [(Q) ⇒ (P)].
Tarski’s Axiom VI.   [(P) ⇒ (Q)] ⇒ {[(Q) ⇒ (P)] ⇒ [(P) ⇔ (Q)]}.
Tarski’s Axiom VII.  {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)].


The exercises show that Tarski’s axioms I-VII are equivalent to axioms P1,
P2, and P3. Other equivalent selections of axioms include Kleene’s, listed in
example 1.88 [72, p. 15].


1.88 Example (Kleene’s Axioms). Kleene lists four axioms [72, p. 15]:
Kleene’s axiom 1a.  (A) ⇒ [(B) ⇒ (A)].
Kleene’s axiom 1b.  [(A) ⇒ (B)] ⇒ ({(A) ⇒ [(B) ⇒ (C)]} ⇒ [(A) ⇒ (C)]).
Kleene’s axiom 7.   [(A) ⇒ (B)] ⇒ ({(A) ⇒ [¬(B)]} ⇒ [¬(A)]).
Kleene’s axiom 8.   {¬[¬(A)]} ⇒ (A).


The exercises also show that Kleene’s axioms are derivable from axioms P1, P2,
and P3. Yet other equivalent selections of axioms include John Barkley Rosser’s,
listed in example 1.89 [110, p. 55]:


1.89 Example (Rosser’s Axioms). Rosser lists three axioms [110, p. 55]:
Axiom R1  (P) ⇒ [(P) ∧ (P)].

Axiom R2  [(P) ∧ (Q)] ⇒ (P).

Axiom R3  [(P) ⇒ (Q)] ⇒ ({¬[(Q) ∧ (R)]} ⇒ {¬[(R) ∧ (P)]}).


The exercises show that Rosser’s axioms are derivable from axioms P1, P2, and 
P3. Rosser’s reference [110] establishes the converse. 
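
As a sanity check on the three axiom lists, the following Python sketch confirms that each of Tarski’s, Kleene’s, and Rosser’s axioms is a Boolean tautology. This is only a necessary condition for the equivalences claimed above, not a proof of derivability; the helper names (imp, iff, AXIOMS) are hypothetical, and the letters A, B, C of Kleene’s axioms are renamed p, q, r for uniformity.

```python
from itertools import product

def imp(a, b):
    """Boolean implication."""
    return (not a) or b

def iff(a, b):
    """Boolean equivalence."""
    return a == b

# Every schema is written as a function of three letters, even if it uses fewer.
AXIOMS = {
    "Tarski I":   lambda p, q, r: imp(p, imp(q, p)),
    "Tarski II":  lambda p, q, r: imp(imp(p, imp(p, q)), imp(p, q)),
    "Tarski III": lambda p, q, r: imp(imp(p, q), imp(imp(q, r), imp(p, r))),
    "Tarski IV":  lambda p, q, r: imp(iff(p, q), imp(p, q)),
    "Tarski V":   lambda p, q, r: imp(iff(p, q), imp(q, p)),
    "Tarski VI":  lambda p, q, r: imp(imp(p, q), imp(imp(q, p), iff(p, q))),
    "Tarski VII": lambda p, q, r: imp(imp(not q, not p), imp(p, q)),
    "Kleene 1a":  lambda p, q, r: imp(p, imp(q, p)),
    "Kleene 1b":  lambda p, q, r: imp(imp(p, q), imp(imp(p, imp(q, r)), imp(p, r))),
    "Kleene 7":   lambda p, q, r: imp(imp(p, q), imp(imp(p, not q), not p)),
    "Kleene 8":   lambda p, q, r: imp(not (not p), p),
    "Rosser R1":  lambda p, q, r: imp(p, p and p),
    "Rosser R2":  lambda p, q, r: imp(p and q, p),
    "Rosser R3":  lambda p, q, r: imp(imp(p, q), imp(not (q and r), not (r and p))),
}

ROWS = list(product([True, False], repeat=3))
checks = {name: all(f(*row) for row in ROWS) for name, f in AXIOMS.items()}
for name, ok in checks.items():
    print(f"{name}: {'tautology' if ok else 'NOT a tautology'}")
```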
1.6.5 Exercises on Kleene’s, Rosser’s, and Tarski’s Axioms


For the following exercises, define ∨, ∧, and ⇔ as in definition 1.51.


1.67. Prove that Tarski’s axiom IV, [(P) ⇔ (Q)] ⇒ [(P) ⇒ (Q)], is derivable
from axioms P1, P2, and P3.


1.68. Prove that Tarski’s axiom V, [(P) ⇔ (Q)] ⇒ [(Q) ⇒ (P)], is derivable from
axioms P1, P2, and P3.


1.69. Prove that Tarski’s axiom VI, [(P) ⇒ (Q)] ⇒ {[(Q) ⇒ (P)] ⇒ [(P) ⇔
(Q)]}, is derivable from axioms P1, P2, and P3.


1.70. Prove that every theorem derivable from Tarski’s axioms I-VII is also
derivable from axioms P1, P2, and P3.


1.71. Prove that every theorem derivable from axioms P1, P2, and P3 is also
derivable from Tarski’s axioms I-VII.


1.72. Prove that Rosser’s axiom R1, (P) ⇒ [(P) ∧ (P)], is derivable from
axioms P1, P2, and P3.


1.73. Prove that Rosser’s axiom R3, [(P) ⇒ (Q)] ⇒ ({¬[(Q) ∧ (R)]} ⇒ {¬[(R) ∧
(P)]}), is derivable from axioms P1, P2, and P3.


1.74. Prove that Rosser’s axiom R2, [(P) ∧ (Q)] ⇒ (P), is derivable from
axioms P1, P2, and P3.


1.75. Prove that Kleene’s axiom 7, [(A) ⇒ (B)] ⇒ ({(A) ⇒ [¬(B)]} ⇒ [¬(A)]),
is derivable from axioms P1, P2, and P3.


1.76. Prove that Kleene’s axiom 8, {¬[¬(A)]} ⇒ (A), is derivable from axioms P1,
P2, and P3.


1.7 Completeness, Decidability, Independence, Provability, 
and Soundness 


This section addresses the question of how to determine whether a formula admits
a proof from selected axioms, and, if so, how to find such a proof. The main tool to
this end consists of multi-valued logics, also called “fuzzy” logics [102].


1.7.1 Multi-Valued Fuzzy Logics 


Multi-valued fuzzy logics are “models” of propositional logics that assign a “value”
to each propositional variable, and hence also to each formulaic letter and each
well-formed propositional form, by means of a table or a formula.


Table 1.94 Church’s triadic logic with values u, w, and distinguished value v [18, p. 113].

¬(P) | P | Q | (P) ⇒ (Q) | (Q) | ⇒ | [(P) ⇒ (Q)]
 u   | v | v |     v     |  v  | v |      v
 u   | v | w |     w     |  w  | v |      w
 u   | v | u |     u     |  u  | v |      u
 w   | w | v |     v     |  v  | v |      v
 w   | w | w |     v     |  w  | v |      v
 w   | w | u |     u     |  u  | v |      u
 v   | u | v |     v     |  v  | v |      v
 v   | u | w |     v     |  w  | v |      v
 v   | u | u |     v     |  u  | v |      v

1.90 Definition. A value in propositional logic may be any symbol that is not 
already allowed in well-formed propositional formulae. Thus a value may not be 
a connective, formulaic or propositional letter, parenthesis, bracket, or brace. 


For the present purposes three values suffice, denoted here by u, v, and w. A logic 
with exactly three values is also called a triadic logic. A multi-valued logic also 
designates any one of its values as the distinguished value, for example, v (Fig. 1.1). 


1.91 Definition. A propositional form holds, or is valid or a tautology, in a
multi-valued logic if and only if it has the distinguished value regardless of the values of
its components. The notation ⊨ P indicates that P is valid [72, p. 12 & p. 14].


1.92 Remark. Such software packages as John Harrison’s program [54, 55] and 
Stephen Wolfram’s Mathematica [142] provide facilities to calculate and print 
multi-valued Truth tables of propositional forms, as explained in [102]. 


1.93 Example (Church’s triadic logic). Table 1.94 defines (by fiat) the values of
the negation ¬(P) and of the implication (P) ⇒ (Q) from the values of P and Q in
Church’s triadic logic [18, p. 113, un-numbered table, penultimate column]. The last
column shows how to derive the values of the compound formula (Q) ⇒ [(P) ⇒
(Q)] from the values of its components. The formula (Q) ⇒ [(P) ⇒ (Q)] is valid
because it has the distinguished value v regardless of the values of P and Q.
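
The computation described in this example can be mechanized. The Python sketch below transcribes the negation and implication of Church’s triadic logic as read from table 1.94 (so any transcription error there would carry over) and recomputes the last column for (Q) ⇒ [(P) ⇒ (Q)]. The names NEG, IMP, and is_valid are illustrative, not from the text.

```python
from itertools import product

VALUES = ("v", "w", "u")   # v is the distinguished value
DISTINGUISHED = "v"

# Negation and implication, transcribed from table 1.94 (key: (P, Q)).
NEG = {"v": "u", "w": "w", "u": "v"}
IMP = {
    ("v", "v"): "v", ("v", "w"): "w", ("v", "u"): "u",
    ("w", "v"): "v", ("w", "w"): "v", ("w", "u"): "u",
    ("u", "v"): "v", ("u", "w"): "v", ("u", "u"): "v",
}

def imp(a, b):
    """Value of (a) => (b) in Church's triadic logic."""
    return IMP[(a, b)]

def is_valid(formula, arity):
    """A form is valid if it takes the distinguished value on every row."""
    return all(formula(*row) == DISTINGUISHED
               for row in product(VALUES, repeat=arity))

def affirmation_of_consequent(p, q):
    """The compound formula (Q) => [(P) => (Q)]."""
    return imp(q, imp(p, q))

# Recompute the last column of the table, row by row.
last_column = {(p, q): affirmation_of_consequent(p, q)
               for p, q in product(VALUES, repeat=2)}
for (p, q), value in last_column.items():
    print(f"P={p} Q={q}  (Q) => [(P) => (Q)] = {value}")
print("valid:", is_valid(affirmation_of_consequent, 2))
```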


1.95 Example (Lukasiewicz’s triadic logic). Table 1.96 defines (by fiat) the values
of the negation ¬(P) and of the implication (P) ⇒ (Q) from the values of P and
Q in Lukasiewicz’s triadic logic [18, 79, p. 113, un-numbered table, last column].
The last column of table 1.96 shows how to derive the values of the compound
formula (Q) ⇒ [(P) ⇒ (Q)] from the values of its components. The formula
(Q) ⇒ [(P) ⇒ (Q)] is valid because it has the distinguished value v regardless of
the values of P and Q.


1.7.2 Sound Multi-Valued Fuzzy Logics


In some logics, validity provides a necessary criterion for provability: if there is a
proof of a theorem L, then L is valid. Such logics are called sound.


Table 1.96 Lukasiewicz’s triadic logic with values u, w, and distinguished value v.

¬(P) | P | Q | (P) ⇒ (Q) | (Q) | ⇒ | [(P) ⇒ (Q)]
 u   | v | v |     v     |  v  | v |      v
 u   | v | w |     w     |  w  | v |      w
 u   | v | u |     u     |  u  | v |      u
 w   | w | v |     v     |  v  | v |      v
 w   | w | w |     v     |  w  | v |      v
 w   | w | u |     w     |  u  | v |      w
 v   | u | v |     v     |  v  | v |      v
 v   | u | w |     v     |  w  | v |      v
 v   | u | u |     v     |  u  | v |      v


Fig. 1.1 Triadic Truth-tables on page 640 of Charles Sanders Peirce’s 1909 Logic
Notebook: MS Am 1632 (339) Houghton Library, Harvard University, used by
permission. (http://pds.lib.harvard.edu/pds/view/15255301?n=640&


1.97 Definition. A multi-valued fuzzy logic is sound if and only if all its theorems
are valid: for every formula L, if ⊢ L (if L is a theorem), then ⊨ L (then L is valid).


Every axiom is also a theorem and thus must be valid in a sound logic. 


1.98 Example. Table 1.94 in example 1.93 shows that Frege’s axiom P1 (the law 
of affirmation of the consequent) is valid in Church’s triadic logic. Similarly, 
exercise 1.83 confirms that Frege’s axiom P2 (the law of self-distributivity of 
implication) is also valid in Church’s triadic logic. Thus all the axioms of the Pure 
Positive Implicational Propositional Calculus are valid in Church’s triadic logic. 


If a sound logic allows for inferences by Detachment, then Detachment must be
“sound” or “preserve” valid formulae, in the sense that from valid premisses P and
(P) ⇒ (Q) Detachment must produce a valid conclusion Q.


1.99 Example. Table 1.94 in example 1.93 contains only one line where P and
(P) ⇒ (Q) both have the distinguished value (v): only in the first line. In that line
Q also has the distinguished value (v). Thus Detachment preserves valid formulae
in Church’s triadic logic.
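
The row inspection in example 1.99 can be sketched in Python as follows. The implication table is transcribed from table 1.94 as read here; the function names and the deliberately broken table BROKEN_IMP are hypothetical, introduced only to show what a failure of the criterion looks like.

```python
VALUES = ("v", "w", "u")
DISTINGUISHED = "v"

# Implication of Church's triadic logic, transcribed from table 1.94 (key: (P, Q)).
CHURCH_IMP = {
    ("v", "v"): "v", ("v", "w"): "w", ("v", "u"): "u",
    ("w", "v"): "v", ("w", "w"): "v", ("w", "u"): "u",
    ("u", "v"): "v", ("u", "w"): "v", ("u", "u"): "v",
}

def detachment_rows(imp_table):
    """Rows (P, Q) where both P and (P) => (Q) take the distinguished value."""
    return [(p, q) for p in VALUES for q in VALUES
            if p == DISTINGUISHED and imp_table[(p, q)] == DISTINGUISHED]

def detachment_preserves_validity(imp_table):
    """Detachment is sound if Q is distinguished on every such row."""
    return all(q == DISTINGUISHED for _, q in detachment_rows(imp_table))

# Hypothetical counter-illustration (not a logic from the text): changing the
# single entry v => w to v makes Detachment unsound.
BROKEN_IMP = dict(CHURCH_IMP)
BROKEN_IMP[("v", "w")] = "v"

print(detachment_rows(CHURCH_IMP))
print(detachment_preserves_validity(CHURCH_IMP))
print(detachment_preserves_validity(BROKEN_IMP))
```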


1.100 Example. Table 1.96 in example 1.95 contains only one line where P and 
(P) => (Q) both have the distinguished value (v): only in the first line. In that line 
Q also has the distinguished value (v). Thus Detachment preserves valid formulae 
in Lukasiewicz’s triadic logic. 


Conversely, theorem 1.101 confirms that sound axioms and a sound Detachment 
suffice for a propositional logic to be sound. 


1.101 Theorem. Suppose that a multi-valued propositional logic is such that its
axioms are valid and Detachment from valid premisses produces a valid conclusion:
for all propositional forms H and C, if H is valid, and if (H) ⇒ (C) is valid, then C
must be valid. Then such a logic is sound.


Proof (Outline). This proof shows that each step of a proof is valid. Each step C of
a proof is either an axiom or the result of Detachment. In particular, every formal
proof starts with an axiom.

If C is an axiom, then C is valid by hypothesis.

Suppose that all the steps up to but not including C have already been proved
valid. If C results from Detachment from previous steps H and (H) ⇒ (C), then H
and (H) ⇒ (C) are valid, and hence C is valid by the hypotheses of the theorem.

Thus every step of a proof is a valid formula. In particular, the last step of a proof
is also a valid formula.

A rigorous proof uses the Principle of Mathematical Induction in chapter 4. □


1.102 Example. Examples 1.98 and 1.99 show that in Church’s triadic logic 
Detachment is sound and all the axioms of the Pure Positive Implicational Proposi- 
tional Calculus are valid. Consequently, theorem 1.101 shows that the Pure Positive 
Implicational Propositional Calculus with Church’s triadic logic is sound: every 
theorem of the Pure Positive Implicational Propositional Calculus is a valid formula 
in Church’s triadic logic. 


1.7.3 Independence and Unprovability 


In a sound logic, every theorem is valid. By contraposition, if a formula is not valid, 
then it is not a theorem. Consequently, to prove that there are no proofs of a formula 
L from specified axioms and rules of inference, it suffices to find a system of logical 
values where the logic is sound but the formula L is not valid. 


1.103 Example. Example 1.102 shows that the Pure Positive Implicational
Propositional Calculus is sound with Church’s triadic logic defined by table 1.94. However,
table 1.104 shows that Peirce’s Law is not valid in this logic. Consequently, Peirce’s
Law is unprovable in (is not a theorem of) the Pure Positive Implicational
Propositional Calculus: there are no proofs of Peirce’s Law using only Detachment
and Frege’s axiom P1 (the law of affirmation of the consequent) and axiom P2 (the
law of self-distributivity of implication).


Table 1.104 Peirce’s Law in Church’s triadic logic [18, p. 113].

P | Q | {[(P) ⇒ (Q)] | ⇒ | (P)} | ⇒ | (P)
v | v |      v       | v |  v   | v |  v
v | w |      w       | v |  v   | v |  v
v | u |      u       | v |  v   | v |  v
w | v |      v       | w |  w   | v |  w
w | w |      v       | w |  w   | v |  w
w | u |      u       | v |  w   | w |  w
u | v |      v       | u |  u   | v |  u
u | w |      v       | u |  u   | v |  u
u | u |      v       | u |  u   | v |  u
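
The failure of Peirce’s Law in Church’s triadic logic can also be found by exhaustive search. The following Python sketch, with the implication table transcribed from table 1.94 as read here and with illustrative function names, lists every assignment at which {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P) misses the distinguished value.

```python
from itertools import product

VALUES = ("v", "w", "u")
DISTINGUISHED = "v"

# Implication of Church's triadic logic, transcribed from table 1.94 (key: (P, Q)).
CHURCH_IMP = {
    ("v", "v"): "v", ("v", "w"): "w", ("v", "u"): "u",
    ("w", "v"): "v", ("w", "w"): "v", ("w", "u"): "u",
    ("u", "v"): "v", ("u", "w"): "v", ("u", "u"): "v",
}

def imp(a, b):
    return CHURCH_IMP[(a, b)]

def peirce(p, q):
    """Peirce's Law: {[(P) => (Q)] => (P)} => (P)."""
    return imp(imp(imp(p, q), p), p)

def counterexamples(formula, arity):
    """All assignments at which the formula is not distinguished."""
    return [row for row in product(VALUES, repeat=arity)
            if formula(*row) != DISTINGUISHED]

failures = counterexamples(peirce, 2)
for p, q in failures:
    print(f"P={p}, Q={q}: Peirce's Law takes the value {peirce(p, q)}")
```

With the table above, the only failing assignment is P = w, Q = u, which matches the sixth line of table 1.104. A single such row suffices for unprovability, because soundness forces every theorem to take the value v on all rows.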


Different analyses of different formulae may require different multi-valued fuzzy 
logics. For instance, the question whether Frege’s axiom P2 (the law of self- 
distributivity of implication) can be proved from axiom P1 (the law of affirmation 
of the consequent) cannot be answered from Church’s triadic logic defined by 
table 1.94, because both axioms are valid there. However, another multi-valued 
fuzzy logic answers the question. 


1.105 Example. Table 1.96 in example 1.95 shows that Frege’s axiom P1 (the law 
of affirmation of the consequent) and Detachment are valid in Lukasiewicz’s triadic 
logic. However, exercise 1.83 reveals that axiom P2 (the law of self-distributivity 
of implication) is not valid in the same logic. Therefore, axiom P2 is not provable 
in — is not a theorem of — the implicational propositional logic with Detachment 
and only one axiom: the single axiom P1. 
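
The claims of example 1.105 can be sketched in Python under one loudly stated assumption: the implication table below is the standard Lukasiewicz three-valued implication, which agrees with Church’s table 1.94 everywhere except that w ⇒ u takes the value w. It is assumed, not verified against the printed table 1.96. The helper names are illustrative.

```python
from itertools import product

VALUES = ("v", "w", "u")
DISTINGUISHED = "v"

# ASSUMPTION: standard Lukasiewicz three-valued implication (key: (P, Q)).
# It differs from Church's table only in the entry ("w", "u").
LUK_IMP = {
    ("v", "v"): "v", ("v", "w"): "w", ("v", "u"): "u",
    ("w", "v"): "v", ("w", "w"): "v", ("w", "u"): "w",
    ("u", "v"): "v", ("u", "w"): "v", ("u", "u"): "v",
}

def imp(a, b):
    return LUK_IMP[(a, b)]

def counterexamples(formula, arity):
    """All assignments at which the formula is not distinguished."""
    return [row for row in product(VALUES, repeat=arity)
            if formula(*row) != DISTINGUISHED]

def axiom_p1(p, q):
    """Affirmation of the consequent: (Q) => [(P) => (Q)]."""
    return imp(q, imp(p, q))

def axiom_p2(p, q, r):
    """Self-distributivity: {(P) => [(Q) => (R)]} => {[(P) => (Q)] => [(P) => (R)]}."""
    return imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r)))

# Detachment: rows where P and (P) => (Q) are both distinguished.
detachment_rows = [(p, q) for p, q in product(VALUES, repeat=2)
                   if p == DISTINGUISHED and imp(p, q) == DISTINGUISHED]

print("P1 counterexamples:", counterexamples(axiom_p1, 2))
print("P2 counterexamples:", counterexamples(axiom_p2, 3))
print("Detachment rows:", detachment_rows)
```

Under this assumption P1 has no counterexample and Detachment is sound, whereas P2 fails, for instance at P = w, Q = w, R = u. That is the pattern the independence argument needs.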


1.106 Definition. A logical formula L is logically independent from a system of 
axioms and rules of inference if and only if L is not provable in — is not a theorem 
of — the logic defined by the system of axioms and rules of inference. 


1.107 Example. Example 1.103 shows that Peirce’s Law is independent from the 
Pure Positive Implicational Propositional Calculus. 


1.108 Example. Example 1.105 shows that Frege’s axiom P2 (the law of self- 
distributivity of implication) is independent from the implicational propositional 
logic defined by Frege’s axiom P1 (the law of affirmation of the consequent) and 
Detachment. 


In general, the concept of mathematical as opposed to practical impossibility
seems difficult to explain to the general public [28, 29]. Nevertheless, as just
demonstrated, multi-valued logic provides elementary examples of mathematical
impossibilities, in the form of rigorous proofs of the nonexistence of proofs of
specific formulae. Such examples thereby explain the concept, though not the
proof, of such impossibilities as Arrow’s Impossibility Theorem in voting
presented in chapter 7, ruler-and-compass angle trisection, duplication of the cube,
a proof of Euclid’s fifth postulate in geometry, a solution to the decision
problem in first-order logic, or Gödel’s Incompleteness Theorem in logic.


1.7.4 Complete Multi-Valued Fuzzy Logics


In some logics, validity provides a sufficient criterion for provability: if a formula L 
is valid, then there is a proof of L. Such logics are called complete. 


1.109 Definition. A multi-valued logic is complete if and only if all its valid
formulae are theorems: for every formula L, if ⊨ L (if L is valid), then ⊢ L (then L
is a theorem). Otherwise, if there are valid formulae that are not theorems, then the
logic is called incomplete.


1.110 Example. In table 1.94 from example 1.93, deleting all the lines where P or
Q may take the value w, thus keeping only the lines where P and Q may take only
the value u or v, gives a two-valued logic where Frege’s axiom P2 (the law of
self-distributivity of implication) and axiom P1 (the law of affirmation of the consequent)
and Detachment are still valid, because they are valid in the larger table 1.94.

Similarly, in table 1.104 from example 1.103, deleting all the lines where P or Q 
may take the value w, thus keeping only the lines where P and Q may take only the 
value u or v, gives table 1.111, where Peirce’s Law is valid. 

Yet example 1.103 shows that Peirce’s Law is not a theorem of the Pure Positive 
Implicational Propositional Calculus. Consequently, the Pure Positive Implicational 
Propositional Calculus with the dyadic logic defined by table 1.111 is incomplete. 
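
The row deletion described in example 1.110 can be sketched in Python. The implication table is again transcribed from table 1.94 as read here; the variable names are illustrative. The sketch checks that the rows without w form a closed two-valued table, and that Peirce’s Law is valid on those rows but not on the full triadic table.

```python
from itertools import product

# Implication of Church's triadic logic, transcribed from table 1.94 (key: (P, Q)).
CHURCH_IMP = {
    ("v", "v"): "v", ("v", "w"): "w", ("v", "u"): "u",
    ("w", "v"): "v", ("w", "w"): "v", ("w", "u"): "u",
    ("u", "v"): "v", ("u", "w"): "v", ("u", "u"): "v",
}

def imp(a, b):
    return CHURCH_IMP[(a, b)]

def peirce(p, q):
    """Peirce's Law: {[(P) => (Q)] => (P)} => (P)."""
    return imp(imp(imp(p, q), p), p)

def is_valid(formula, arity, values):
    """Valid over the given set of values, with v distinguished."""
    return all(formula(*row) == "v" for row in product(values, repeat=arity))

# Delete every line in which P or Q takes the value w.
dyadic_rows = {key: value for key, value in CHURCH_IMP.items() if "w" not in key}

# The remaining rows never produce w, so they define a two-valued logic.
closed = all(value in ("u", "v") for value in dyadic_rows.values())

valid_dyadic = is_valid(peirce, 2, ("v", "u"))
valid_triadic = is_valid(peirce, 2, ("v", "w", "u"))

print("dyadic rows:", dyadic_rows)
print("closed under u, v:", closed)
print("Peirce valid (dyadic):", valid_dyadic)
print("Peirce valid (triadic):", valid_triadic)
```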


In a two-valued (dyadic) Boolean logic, the distinguished value v is also called 
True or denoted by 1, while the other value u is also called False or denoted by 0. 


1.111 Example. The three axioms formed by Peirce’s Law with Axioms P1 and P2 
in addition to Detachment lead to a complete logic, where every Boolean dyadic 
tautology has a proof [108, p. 25]. 


Table 1.111 Peirce’s Law is valid in Boolean dyadic logic.

P | Q | {[(P) ⇒ (Q)] | ⇒ | (P)} | ⇒ | (P)
v | v |      v       | v |  v   | v |  v
v | u |      u       | v |  v   | v |  v
u | v |      v       | u |  u   | v |  u
u | u |      v       | u |  u   | v |  u


1.7.5 Peirce’s Law as a Denial of the Antecedent


Theorems 1.67, 1.68, and 1.78 show that the formula [(P) ⇒ (Q)] ⇒ (Q) uses
only the connective ⇒ but is equivalent to (P) ∨ (Q) in the Classical Propositional
Calculus. Similarly, without negations, using only implications, Charles Sanders 
Peirce’s Law 


(Peirce’s Law)    {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)    (1.1)


[104, p. 189-190] corresponds to the pattern of reasoning by the Law of the
Excluded Middle (B) ∨ [¬(B)] as follows. With Boolean logic, in Peirce’s Law (1.1)
the formula (P) ⇒ (Q) is True for every proposition Q if and only if P is False. For
this reason, (P) ⇒ (Q) is a form of a denial of the antecedent P, as in the Law of
Denial of the Antecedent [¬(P)] ⇒ [(P) ⇒ (Q)] (theorem 1.40):


Peirce took ‘p ⊃ a’ as the denial of whatever statement ‘p’ abbreviates on the ground that
to say of a statement ‘It implies everything,’ was tantamount to saying of it ‘It is false.’ [9,
p. 157]


With P and Q replaced by (P) ⇒ (Q) and P respectively, if the denial (P) ⇒ (Q)
is False, then [(P) ⇒ (Q)] ⇒ (P) is True, whence, by Peirce’s Law (1.1) and
Detachment, the consequent P is True. In other words, from the Falsity of the denial
of P follows the Truth of P. Equivalently, if ¬(P) is False, then P must be True. In
this sense, using only the implicational connective ⇒, Peirce’s Law (1.1) expresses
the law of the excluded middle (P) ∨ [¬(P)] [104, p. 189-190].


1.7.6 Exercises on Church’s and Lukasiewicz’s Triadic Systems 


For each of the following formulae, determine whether it is a triadic tautology with 
Church’s table 1.94, or Lukasiewicz’s table 1.96, or both, or neither. 


1.77. [(P) ⇒ (Q)] ⇒ {[(Q) ⇒ (R)] ⇒ [(P) ⇒ (R)]}.

1.78. (P) ⇒ (P).

1.79. {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P).

1.80. [(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)]).

1.81. {[(P) ⇒ (Q)] ⇒ (Q)} ⇒ {[(Q) ⇒ (P)] ⇒ (P)}.

1.82. [¬(P)] ⇒ [(P) ⇒ (Q)].

1.83. {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.
1.84. [(P) > (Q)] > ({) => [-@)]} > [-))). 

1.85. {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {(Q) ⇒ [(P) ⇒ (R)]}.

1.86. {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)].


1.8 Boolean Logic


Boolean logic is a dyadic logic: a multi-valued logic with exactly two values. The 
distinguished value may be called “True” and denoted by v, whereas the other value 
may be called “False” and denoted by u. Table 1.113 defines the Boolean Truth 
values of compound propositional formulae with selected connectives. 

Boolean logic will lead to an algorithm to design proofs in the Full Propositional 
Calculus in section 1.9. 


1.8.1 The Truth Table of the Logical Implication 


Based on [101], this subsection clarifies the Truth table of the logical implication. 
True logical implications with a False hypothesis are convenient in classical 

mathematics, but other versions of mathematics do not include them [33]. Moreover, 

True logical implications with a False hypothesis rarely occur in practical reasoning: 


Actually, the rule that any conditional is true if its antecedent is known to be false has almost 
no parallel in natural logic. Examples of the type “if snow is black, then 2 x 2 = 5,” which 
keep cropping up in textbooks, are only capable of confusing the student, since no natural 
subsystem in our language has expressions with this semantics [81, p. 36]. 


Correspondingly, some computers and logical circuits do not include any facility to 
test the Truth value of logical implications. Therefore the following four examples 
serve solely to demonstrate the difference between Boolean algebraic logic and 
practical reasoning. There exist other logics, but they are less used and more 
complicated than Boolean logic [18, p. 142, p. 146, §26.11]. 

Table 1.113 Boolean dyadic logic with values u and distinguished value v.

¬(P) | P | Q | (P) ⇒ (Q) | (P) ∧ (Q) | (P) ∨ (Q) | (P) NOR (Q) | (P) ⇔ (Q)
 u   | v | v |     v     |     v     |     v     |      u      |     v
 u   | v | u |     u     |     u     |     v     |      u      |     u
 v   | u | v |     v     |     u     |     v     |      u      |     u
 v   | u | u |     v     |     u     |     u     |      v      |     v


The present subsection shows that the Truth table for the Boolean logical
implication is the only Truth table that satisfies certain requirements. Specifically,
the present considerations confirm that the complete law of contraposition


[(P) ⇒ (Q)] ⇔ {[¬(Q)] ⇒ [¬(P)]}


and the nonequivalence of an implication (P) ⇒ (Q) with its converse (Q) ⇒ (P)
hold only with implications defined as in table 1.113. To this end, denote by ⇛ any
candidate connective for a logical implication. To reflect common experience, as in
example 1.1, any concept of logical implication may have to satisfy the following
four requirements.


(Implication.1) To allow for reasoning by Detachment, if a hypothesis P holds, and
if the implication (P) ⇛ (Q) holds, then the conclusion Q also holds.

(Implication.2) To avoid faulty reasoning, if a hypothesis P holds, but if the
conclusion Q fails, then the implication (P) ⇛ (Q) also fails.

(Implication.3) To allow reasoning by contraposition, an implication and its
contraposition must have the same Truth table.

(Implication.4) Again to prevent faulty reasoning, an implication and its converse
must have different Truth tables.


The first two requirements, (Implication.1) & (Implication.2), dictate the first two
lines of Truth table 1.114, where the hypothesis P is True. There remain only
four possibilities for ⇛ in the last two rows, where the hypothesis is False. For
convenience, denote these four connectives by ⇒, ↬, ⇝, and ∼ respectively
(these last three symbols are used in this manner only in the present discussion).
Table 1.115 shows their Truth values. For comparison, the last two requirements,
(Implication.3) & (Implication.4), dictate the last two lines of the desired Truth
table.


Table 1.114 A partial Truth table for a connective (P) ⇛ (Q) conforming to
(Implication.1) & (Implication.2).

REQUIREMENT     | P | Q | (P) ⇛ (Q)
(Implication.1) | T | T |     T
(Implication.2) | T | F |     F


Table 1.115 The connectives ↬, ⇝, ∼, and ⇒.

P | Q | (P) ↬ (Q) | (P) ⇝ (Q) | (P) ∼ (Q) | (P) ⇒ (Q)
T | T |     T     |     T     |     T     |     T
T | F |     F     |     F     |     F     |     F
F | T |     T     |     F     |     F     |     T
F | F |     F     |     F     |     T     |     T

Verifications based on table 1.113 confirm that the logical implication ⇒ has
the same Truth values as its contraposition has, but not as its converse has.

In contrast, (P) % (Q) and its contraposition [¬(Q)] % [¬(P)] have different
Truth values, as in table 1.116. Also, for the connective %, neither the law of
contraposition nor its converse holds.

Similarly, (P) ~> (Q) and its contraposition [¬(Q)] ~> [¬(P)] do not have the
same Truth values, as in table 1.117. Moreover, for the connective ~>, neither the
law of contraposition nor its converse holds.

Table 1.116 The connective % and its contraposition do not have the same Truth
values.

P | Q | (P) % (Q) | [¬(Q)] % [¬(P)]
T | T | T         | F
T | F | F         | F
F | T | T         | T
F | F | F         | T


Table 1.117 The connective ~> and its contraposition do not have the same Truth
values.

P | Q | (P) ~> (Q) | [¬(Q)] ~> [¬(P)]
T | T | T          | F
T | F | F          | F
F | T | F          | F
F | F | F          | T


Table 1.118 The connective ~ and its converse have the same Truth values.

P | Q | (P) ~ (Q) | (Q) ~ (P)
T | T | T         | T
T | F | F         | F
F | T | F         | F
F | F | T         | T


Finally, (P) ~ (Q) and its converse (Q) ~ (P) have the same Truth values, as 
in table 1.118. However, for the connective ~, both the law of contraposition and 
its converse hold. 

Thus the Truth table 1.113 specified for (P) ⇒ (Q) is the only one that reflects
experience. There exist other concepts of logical implication, but they do not lend
themselves to Truth tables [18, p. 146, #26.12].
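
The elimination argument above can be replayed mechanically. The following minimal sketch (in Python, which is an assumption of this illustration and not part of the text) fixes the first two rows as required by (Implication.1) and (Implication.2), enumerates the four ways of filling the last two rows, and keeps only the tables that agree with their contraposition and differ from their converse. The function names are hypothetical.

```python
from itertools import product

# Rows are ordered (P, Q) = (T,T), (T,F), (F,T), (F,F).
ROWS = list(product([True, False], repeat=2))

def candidates():
    """Yield every connective whose first two rows are fixed by
    (Implication.1) and (Implication.2): (T,T) -> T and (T,F) -> F."""
    for ft, ff in product([True, False], repeat=2):
        yield {(True, True): True, (True, False): False,
               (False, True): ft, (False, False): ff}

def same_as_contraposition(table):
    # (Implication.3): P op Q agrees with (not Q) op (not P) on every row.
    return all(table[(p, q)] == table[(not q, not p)] for p, q in ROWS)

def differs_from_converse(table):
    # (Implication.4): P op Q differs from Q op P on at least one row.
    return any(table[(p, q)] != table[(q, p)] for p, q in ROWS)

survivors = [t for t in candidates()
             if same_as_contraposition(t) and differs_from_converse(t)]

for t in survivors:
    print([t[row] for row in ROWS])
```

Only one table survives, with the values True, False, True, True in the row order above, which is the table of ⇒.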


1.8.2. Boolean Logic on Earth and in Space 


Some logical connectives, for instance NOR, can combine so as to play the role
of every logical connective.


1.119 Definition. A logical connective is called primitive, or also universal, if 
and only if every propositional form is logically equivalent to a propositional form 
containing only that connective. 


1.120 Example. The logical connective NOR, defined so that (A) NOR (B) stands
for ¬(A ∨ B), is universal; in particular, the following equivalences are tautologies:


[¬(P)] ⇔ [(P) NOR (P)],
[(P) ∧ (Q)] ⇔ {[(P) NOR (P)] NOR [(Q) NOR (Q)]},
[(P) ∨ (Q)] ⇔ {[(P) NOR (Q)] NOR [(P) NOR (Q)]},
[(P) ⇒ (Q)] ⇔ ({[(P) NOR (P)] NOR (Q)} NOR {[(P) NOR (P)] NOR (Q)}).


1.121 Example. The logical connective NAND, defined so that (A) NAND (B) stands
for ¬(A ∧ B), is universal; in particular, the following equivalences are tautologies:


[¬(P)] ⇔ [(P) NAND (P)],
[(P) ∧ (Q)] ⇔ {[(P) NAND (Q)] NAND [(P) NAND (Q)]},
[(P) ∨ (Q)] ⇔ {[(P) NAND (P)] NAND [(Q) NAND (Q)]},
[(P) ⇒ (Q)] ⇔ {(P) NAND [(Q) NAND (Q)]}.
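
The equivalences in examples 1.120 and 1.121 can be checked by exhausting the four combinations of Truth values. The sketch below is only such a check, written in Python with hypothetical helper names; it does not replace a proof within the propositional calculus.

```python
from itertools import product

def nor(a, b):
    return not (a or b)

def nand(a, b):
    return not (a and b)

def implies(a, b):
    return (not a) or b

def nor_identities(p, q):
    # One entry per equivalence of example 1.120.
    return [
        (not p) == nor(p, p),
        (p and q) == nor(nor(p, p), nor(q, q)),
        (p or q) == nor(nor(p, q), nor(p, q)),
        implies(p, q) == nor(nor(nor(p, p), q), nor(nor(p, p), q)),
    ]

def nand_identities(p, q):
    # One entry per equivalence of example 1.121.
    return [
        (not p) == nand(p, p),
        (p and q) == nand(nand(p, q), nand(p, q)),
        (p or q) == nand(nand(p, p), nand(q, q)),
        implies(p, q) == nand(p, nand(q, q)),
    ]

all_hold = all(all(nor_identities(p, q)) and all(nand_identities(p, q))
               for p, q in product([True, False], repeat=2))
print(all_hold)
```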


1.122 Example. For logic and arithmetic, Westinghouse DPS-2402 Computers used
only NAND gates: 


Its function can be considered fundamental in that all sequential and combinational logic 
functions can be performed entirely by NAND gates [138, Section 4, § 4-1(7)(a), p. 4-4]. 


1.123 Example. The Apollo spacecraft contained two electrically identical Apollo 
Guidance Computers (AGC): a Command Module Computer (CMC) in the Com- 
mand Module (CM), and a Lunar Module Computer (LMC) in the Lunar Module 
(LM) [51, § 2.1, p. 23], pictured in figure 1.2. 


Fig. 1.2 The logic in the present chapter was used in the Apollo Guidance
Computer onboard the Command and Lunar Modules. (Neil A. Armstrong’s photograph
of Edwin E. Aldrin Jr., 20 July 1969, NASA ID AS11-40-5927.)

Each AGC used only one type of logical circuit: a NOR gate with three variables,
such that NOR (A, B, C) is equivalent to ¬(A ∨ B ∨ C) [51, § 3.2.1, p. 60, fig. 3-1].
Setting C to False (0 V) shows that NOR (A, B, F) is equivalent to ¬(A ∨ B) and
hence to (A) NOR (B). By universality of NOR, every logical, arithmetic, reading,
writing, and copying operation necessary during space flight was implemented by
circuits consisting entirely and exclusively of NOR gates [51, § 3.2.1, p. 62]. The
universality of NOR provides a reliability greater than several different connectives
would:


The single logic type simplified packaging, manufacturing, and testing, and gave higher 
confidence to the reliability predictions [51, § 1.1, p. 10]. 
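
As an illustration of how a single three-input NOR gate suffices, the following sketch builds negation, disjunction, conjunction, and a one-bit half adder from one Python function `nor3`. The wiring shown is a hypothetical example chosen for brevity; it is not taken from the AGC schematics, and all names are assumptions of this sketch.

```python
from itertools import product

def nor3(a, b, c):
    """Three-input NOR gate: True only when all three inputs are False."""
    return not (a or b or c)

def nor2(a, b):
    # Tie the third input to False, as in NOR (A, B, F).
    return nor3(a, b, False)

def not_(a):
    return nor3(a, False, False)

def or_(a, b):
    return not_(nor2(a, b))

def and_(a, b):
    return nor2(not_(a), not_(b))

def half_adder(a, b):
    """Return (sum, carry) for one-bit addition, built only from nor3."""
    carry = and_(a, b)
    # sum is the exclusive or: NOR of (a AND b) with (a NOR b).
    total = nor2(carry, nor2(a, b))
    return total, carry

for a, b in product([False, True], repeat=2):
    print(a, b, half_adder(a, b))
```

Every helper reduces to calls of `nor3`, so the half adder uses a single gate type throughout.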


1.9 Automated Theorem Proving 


Relying on the Deduction Theorem 1.22, the Provability Theorem and the Com- 
pleteness Theorem will provide not only guidance but an algorithm to design proofs 
within the propositional calculus. 


1.9.1 The Provability Theorem 


The full Classical Propositional Calculus based on axioms P1, P2, and P3 is 
absolutely complete: every tautology is a theorem. Moreover, there are algorithms 
to determine for each propositional form whether it is a theorem, and, if it is, to 
design a proof of it. The demonstration relies on the following notation. 


1.124 Definition. For each proposition P, define a proposition P’ by

P’ is P if P is True,
P’ is ¬(P) if P is False.

The following theorem constitutes a first step in a proof of completeness.


1.125 Theorem (Provability Theorem). For each propositional form S with
propositional variables from a finite list P,...,R, there exists a proof of S’ from
P’,...,R’:


P’,...,R’ ⊢ S’.


Proof (Outline). This proof proceeds by cases, removing from S one connective at
each step.

Negation


If S is ¬(V), then V has one fewer connective than S has. Suppose that P’,...,R’ ⊢
V’ has already been proved, and consider two cases.


S True  If S is True, then S’ is S. However, if S is True, then V is False, and V’ is
¬(V), which is S and hence also S’. Thus P’,...,R’ ⊢ V’ by the hypothesis on
V, and substituting S’ for V’ yields P’,...,R’ ⊢ S’.

S False  In contrast, if S is False, then S’ is ¬(S). However, if S is False, then V is
True, and V’ is V. Thus P’,...,R’ ⊢ V’ by the hypothesis on V, and substituting
V for V’ yields P’,...,R’ ⊢ V. Hence, appending a proof of the converse law
of double negation (theorem 1.43) produces a proof of P’,...,R’ ⊢ {¬[¬(V)]},
and substituting S for ¬(V) gives P’,...,R’ ⊢ [¬(S)], whence substituting S’ for
¬(S) yields P’,...,R’ ⊢ S’.


Implication


If S is (V) ⇒ (W), then V and W have fewer connectives than S has. Suppose that
P’,...,R’ ⊢ V’ and P’,...,R’ ⊢ W’ have already been proved, and consider three
cases. The first two cases occur if S is True, which occurs if W is True or V is False.


S True, W True  If W is True, then W’ is W, and by the hypothesis on W there exists
a proof of P’,...,R’ ⊢ W. Again because W is True, it follows from theorem 1.12
that (V) ⇒ (W) is also True, and appending a proof of theorem 1.12 after
the proof of P’,...,R’ ⊢ W produces a proof of P’,...,R’ ⊢ [(V) ⇒ (W)].
However, because (V) ⇒ (W) is True and (V) ⇒ (W) is S, it also follows that S
is S’, whence P’,...,R’ ⊢ [(V) ⇒ (W)] is P’,...,R’ ⊢ S’.

S True, V False  If V is False, then V’ is ¬(V) and True. Thus there exists a proof
of P’,...,R’ ⊢ V’ by the hypothesis on V, which is thus a proof of P’,...,R’ ⊢
[¬(V)]. Hence the law of denial of the antecedent (theorem 1.40) gives a proof
of [¬(V)] ⇒ [(V) ⇒ (W)], which is [¬(V)] ⇒ (S), and thence the transitivity
of implications (theorem 1.16) yields a proof of P’,...,R’ ⊢ S, which is also a
proof of P’,...,R’ ⊢ S’.

S False  The third case occurs if S is False, which occurs if and only if V is True
and W is False. Then S’ is ¬(S) and W’ is ¬(W) but V’ is V. By the hypotheses
on V and W, there exist proofs of P’,...,R’ ⊢ V’ and P’,...,R’ ⊢ W’, which
are thus proofs of P’,...,R’ ⊢ V and P’,...,R’ ⊢ [¬(W)]. Appending a proof
of theorem 1.54 then gives a proof of P’,...,R’ ⊢ {(V) ∧ [¬(W)]}, whence the
definition (1.51) of ∧ produces a proof of P’,...,R’ ⊢ {¬[(V) ⇒ {¬[¬(W)]}]}.
Thence the converse law of double negation, transitivity applied to


[(V) ⇒ (W)] ⇒ ({(W) ⇒ {¬[¬(W)]}} ⇒ {(V) ⇒ {¬[¬(W)]}}),


and contraposition yield a proof of P’,...,R’ ⊢ {¬[(V) ⇒ (W)]}, which is a
proof of P’,...,R’ ⊢ [¬(S)] and hence also a proof of P’,...,R’ ⊢ S’.

The general case follows by several applications of the previous cases, in a way
that may be specified more explicitly after the availability of the Principle of
Mathematical Induction in chapter 4. □
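
The case analysis in the outline is recursive on the structure of S, which the following sketch makes explicit. It does not build actual proofs; it only computes S’ from definition 1.124 and lists, innermost subformula first, which case of the outline applies for a given assignment of Truth values. The tuple encoding of formulae and the names `value`, `primed`, `show`, and `plan` are assumptions of this illustration.

```python
# Hypothetical encoding of propositional forms as nested tuples:
#   ("var", "P"), ("not", V), ("imp", V, W)

def value(s, v):
    """Truth value of s under the assignment v (a dict from names to bools)."""
    kind = s[0]
    if kind == "var":
        return v[s[1]]
    if kind == "not":
        return not value(s[1], v)
    if kind == "imp":
        return (not value(s[1], v)) or value(s[2], v)
    raise ValueError(f"unknown connective: {kind}")

def primed(s, v):
    """Definition 1.124: S' is S if S is True, and not(S) if S is False."""
    return s if value(s, v) else ("not", s)

def show(s):
    kind = s[0]
    if kind == "var":
        return s[1]
    if kind == "not":
        return f"¬({show(s[1])})"
    return f"({show(s[1])}) ⇒ ({show(s[2])})"

def plan(s, v, steps=None):
    """List, innermost first, the case of the outline used for each subformula."""
    if steps is None:
        steps = []
    kind = s[0]
    if kind == "var":
        steps.append(f"hypothesis: {show(primed(s, v))}")
    elif kind == "not":
        plan(s[1], v, steps)
        case = "S True" if value(s, v) else "S False"
        steps.append(f"negation, {case}: {show(primed(s, v))}")
    else:
        plan(s[1], v, steps)
        plan(s[2], v, steps)
        if value(s[2], v):
            case = "S True, W True"
        elif not value(s[1], v):
            case = "S True, V False"
        else:
            case = "S False"
        steps.append(f"implication, {case}: {show(primed(s, v))}")
    return steps

P, Q = ("var", "P"), ("var", "Q")
peirce = ("imp", ("imp", ("imp", P, Q), P), P)
for line in plan(peirce, {"P": False, "Q": True}):
    print(line)
```

Each listed step corresponds to one paragraph of the outline; an actual proof would replace each step by the cited theorems.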


1.9.2. The Completeness Theorem 


The Completeness Theorem shows that within the full classical propositional 
calculus every tautology is a theorem, provable from the axioms and the rules of 
inference. 


1.126 Theorem (Completeness Theorem). Within the full classical propositional 
calculus, every tautology is a theorem. 


Proof. This proof uses the Deduction Theorem 1.22 and the Provability
Theorem 1.125, removing at each step one propositional variable that occurs in a
tautology.

For every tautology S with propositional variables P,...,Q,R, theorem 1.125
produces a proof of P’,...,Q’,R’ ⊢ S, because S’ is S. Two cases arise with the last
variable R.


R True  If R is True, then R’ is R, whence from the proof of P’,...,Q’,R ⊢ S, the
Deduction Theorem gives a proof of P’,...,Q’ ⊢ [(R) ⇒ (S)].

R False  If R is False, then R’ is ¬(R), whence from the proof of P’,...,Q’,R’ ⊢ S,
the Deduction Theorem gives a proof of P’,...,Q’ ⊢ {[¬(R)] ⇒ (S)}.


A proof of P’,...,Q’ ⊢ S follows by the principle of proofs by cases (theorem 1.85):

⊢ ([(R) ⇒ (S)] ∧ {[¬(R)] ⇒ (S)}) ⇒ (S).

Thus the Deduction Theorem reduces the number of propositional variables by
1. Therefore, applying the Deduction Theorem as many times as the number of
propositional variables in S yields a proof of ⊢ S. □
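
The bookkeeping in this proof can be sketched as follows: the Provability Theorem supplies one proof for each assignment of Truth values, and each round of the Deduction Theorem followed by proofs by cases discharges the last variable, halving the number of hypothesis lists. The Python functions below are hypothetical helpers that only track the hypothesis lists and check the tautology semantically; they do not construct the proofs.

```python
from itertools import product

def is_tautology(formula, names):
    """formula: a function of a dict {name: bool}; names: its variables."""
    return all(formula(dict(zip(names, row)))
               for row in product([True, False], repeat=len(names)))

def eliminate(names):
    """Return, round by round, the hypothesis lists that remain while the
    last variable is discharged by the Deduction Theorem and proofs by cases."""
    rounds = []
    current = list(names)
    while True:
        # One list of hypotheses per assignment to the remaining variables.
        sequents = [
            [n if val else f"¬({n})" for n, val in zip(current, row)]
            for row in product([True, False], repeat=len(current))
        ]
        rounds.append(sequents)
        if not current:
            return rounds
        current = current[:-1]   # the last variable has been discharged

def peirce(v):
    imp = lambda a, b: (not a) or b
    return imp(imp(imp(v["P"], v["Q"]), v["P"]), v["P"])

rounds = eliminate(["P", "Q"])
print([len(r) for r in rounds])
print(is_tautology(peirce, ["P", "Q"]))
```

For two variables the rounds contain 4, 2, and 1 hypothesis lists; the last list is empty, which corresponds to ⊢ S.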


1.9.3 Example: Peirce’s Law from the Completeness Theorem 


The following considerations demonstrate how to plan the design of a proof by the 
Completeness Theorem (theorem 1.126), here with the example of Peirce’s Law: 


{[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P).


To apply the Completeness Theorem, let S designate Peirce’s Law. Because S
involves only two propositional variables, P and Q, for each of the four
combinations of Truth values of P and Q, the Completeness Theorem first invokes the
Provability Theorem (theorem 1.125) for a separate proof of P’, Q’ ⊢ S’, here

P’, Q’ ⊢ ({[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P))’.


In all cases, S has the propositional form (V) ⇒ (W), where W is P, and where V
is (H) ⇒ (K), with (P) ⇒ (Q) for H, and P for K:


S:  {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P),
V:  [(P) ⇒ (Q)] ⇒ (P),    W:  (P),
H:  (P) ⇒ (Q),            K:  (P).


P True, Q True  If P is True, then W is also True, because W is P. Hence the
Provability Theorem calls for a proof of P ⊢ W, which is here P ⊢ P. Thence the
Deduction Theorem provides a proof of (P) ⇒ (W), in effect here the proof of
theorem 1.14. Because S has the form (V) ⇒ (W), the proof just obtained gives
the following main steps (the final complete proof replaces every theorem cited
by a complete proof of that theorem).


⊢ (P) ⇒ (W)                              theorem 1.14,
⊢ (W) ⇒ [(V) ⇒ (W)]                      axiom P1,
⊢ (P) ⇒ [(V) ⇒ (W)]                      theorem 1.16,
⊢ (P) ⇒ [{[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)]      substitutions.


Alternatively, axiom P1 yields the conclusion directly, but the foregoing
derivation serves to illustrate the use of the Completeness Theorem.

P True, Q False Because P is again True, the preceding reasoning remains valid 
because it does not use the Truth value of Q. 

P False, Q True  If P is False, then so is W. Hence the Provability Theorem calls
for a proof of V’. Here V is [(P) ⇒ (Q)] ⇒ (P), which has the form (H) ⇒ (K).
With P False, H is True and K is False, whence V is False. Consequently, V’ is
¬(V), which has the form ¬[(H) ⇒ (K)]. Therefore, the Provability Theorem
calls for proofs of P’, Q’ ⊢ H and P’, Q’ ⊢ [¬(K)].

Here P’, Q’ ⊢ [¬(K)] is [¬(P)], Q’ ⊢ [¬(P)], which follows from the
substitution [¬(P)] ⇒ [¬(P)] in the proof of theorem 1.14.

Also, P’, Q’ ⊢ H is [¬(P)], Q’ ⊢ [(P) ⇒ (Q)], where P is False. Thus the
Provability Theorem calls for a proof of [¬(P)], Q’ ⊢ [¬(P)], which again
follows from the substitution [¬(P)] ⇒ [¬(P)] in the proof of theorem 1.14.
Hence [¬(P)] ⇒ [(P) ⇒ (Q)] by the law of denial of the antecedent
(theorem 1.40).

These proofs of [¬(P)], Q’ ⊢ H and [¬(P)], Q’ ⊢ [¬(K)] complete the proof of
[¬(P)], Q’ ⊢ {¬[(H) ⇒ (K)]}, which is [¬(P)], Q’ ⊢ [¬(V)]. Again the law of
denial of the antecedent gives a proof of [¬(P)], Q’ ⊢ [(V) ⇒ (W)], which is
[¬(P)], Q’ ⊢ S. The proof just obtained gives the following main steps (the final
proof replaces every theorem cited by a complete proof of that theorem).


⊢ [¬(P)] ⇒ [¬(K)]                          theorem 1.14,
⊢ [¬(P)] ⇒ [¬(P)]                          theorem 1.14,
⊢ [¬(P)] ⇒ {[¬(Q)] ⇒ [¬(P)]}               axiom P1,
⊢ {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)]          axiom P3,
⊢ [¬(P)] ⇒ [(P) ⇒ (Q)]                     theorem 1.16,
⊢ [¬(P)] ⇒ (H)                             substitution;
⊢ [¬(P)] ⇒ {(H) ∧ [¬(K)]}                  theorem 1.55,
⊢ [¬(P)] ⇒ {¬[(H) ⇒ (K)]}                  definition of ∧,
⊢ [¬(P)] ⇒ [¬(V)]                          substitution;
⊢ [¬(V)] ⇒ {[¬(W)] ⇒ [¬(V)]}               axiom P1,
⊢ [¬(P)] ⇒ {[¬(W)] ⇒ [¬(V)]}               theorem 1.16,
⊢ {[¬(W)] ⇒ [¬(V)]} ⇒ [(V) ⇒ (W)]          axiom P3,
⊢ [¬(P)] ⇒ [(V) ⇒ (W)]                     theorem 1.16,
⊢ [¬(P)] ⇒ [{[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)]     substitutions.

P False, Q False Because P is again False, the preceding reasoning remains valid 
because it does not use the Truth value of Q. 


From the preceding proofs of (P) ⇒ (S) and [¬(P)] ⇒ (S), the principle of
proofs by cases (theorem 1.85) yields a proof of Peirce’s Law (S). Subsequent
examinations of the proof produced by the Completeness Theorem can yield
simplifications.


⊢ (P) ⇒ [{[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)]        axiom P1,
⊢ (P) ⇒ (S)                                substitution;
⊢ [¬(P)] ⇒ {(H) ∧ [¬(P)]}                  axiom P1,
⊢ [¬(P)] ⇒ {¬[(H) ⇒ (K)]}                  definition of ∧,
⊢ [¬(P)] ⇒ [¬(V)]                          substitution;
⊢ [¬(V)] ⇒ {[¬(W)] ⇒ [¬(V)]}               axiom P1,
⊢ [¬(P)] ⇒ {[¬(W)] ⇒ [¬(V)]}               theorem 1.16,
⊢ {[¬(W)] ⇒ [¬(V)]} ⇒ [(V) ⇒ (W)]          axiom P3,
⊢ [¬(P)] ⇒ [(V) ⇒ (W)]                     theorem 1.16,
⊢ [¬(P)] ⇒ (S)                             substitution;
⊢ (P) ⇒ (S)                                previous result;
⊢ S                                        rule of inference;
⊢ {[(P) ⇒ (Q)] ⇒ (P)} ⇒ (P)                substitution.


1.9.4 Exercises on the Deduction Theorem 


1.87. Assume that [¬(P)] ⇒ (P) holds and prove that P holds. In other words,
prove that {[¬(P)] ⇒ (P)} ⊢ (P).


1.88. Assume that (P) ⇒ [¬(P)] holds and prove that ¬(P) holds. In other words,
prove that {(P) ⇒ [¬(P)]} ⊢ [¬(P)].


1.89. Apply the Deduction Theorem to prove {[¬(P)] ⇒ (P)} ⇒ (P).
1.90. Apply the Deduction Theorem to prove {(P) ⇒ [¬(P)]} ⇒ [¬(P)].


1.91. Assume (P) ⇒ (Q) and prove {(P) ⇒ [¬(Q)]} ⇒ [¬(P)]. In other words,
prove that [(P) ⇒ (Q)] ⊢ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)]).


1.92. Apply the Deduction Theorem to prove the law of reductio ad absurdum


[(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)]).


1.93. Assume [¬(P)] ⇒ (Q) and prove {[¬(P)] ⇒ [¬(Q)]} ⇒ (P). In other
words, prove that {[¬(P)] ⇒ (Q)} ⊢ ({[¬(P)] ⇒ [¬(Q)]} ⇒ (P)).


1.94. Apply the Deduction Theorem to prove the law of indirect proof
{[¬(P)] ⇒ (Q)} ⇒ ({[¬(P)] ⇒ [¬(Q)]} ⇒ (P)).


1.95. Apply the Deduction Theorem to prove the tautology


{[(P) ⇒ (Q)] ⇒ [(Q) ⇒ (R)]} ⇒ [(Q) ⇒ (R)].

1.96. Apply the Deduction Theorem to prove the tautology


{[(P) ⇒ (Q)] ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.


1.97. Apply the Completeness Theorem to prove {[¬(P)] ⇒ (P)} ⇒ (P).


1.98. Apply the Completeness Theorem to prove the special law of reductio ad
absurdum: {(P) ⇒ [¬(P)]} ⇒ [¬(P)].


1.99. Apply the Completeness Theorem to prove the law of reductio ad absurdum:
[(P) ⇒ (Q)] ⇒ ({(P) ⇒ [¬(Q)]} ⇒ [¬(P)]).


1.100. Apply the Completeness Theorem to prove the law of assertion:


(P) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}.


1.101. Apply the Completeness Theorem to prove


{[(P) ⇒ (Q)] ⇒ (R)} ⇒ {[(R) ⇒ (P)] ⇒ (P)}.


1.102. Apply the Completeness Theorem to prove


{[(P) ⇒ (Q)] ⇒ (R)} ⇒ {[(R) ⇒ (P)] ⇒ [(S) ⇒ (P)]}.


1.103. Apply the Completeness Theorem to prove


[{[(P) ⇒ (R)] ⇒ (Q)} ⇒ (Q)] ⇒ {[(Q) ⇒ (R)] ⇒ [(P) ⇒ (R)]}.


1.104. Apply the Completeness Theorem to prove


[(R) ⇒ (Q)] ⇒ ({[(R) ⇒ (Q)] ⇒ (P)} ⇒ [(S) ⇒ (P)]).


1.105. Apply the Completeness Theorem to prove 


[(R) > Q)] => (6) > ()P > 1) > )] = [6) > )B- 


1.106. Apply the Completeness Theorem to prove 


(A> P+ P}j+(9=>P))) + (lOO) @)}+ (9) )). 
Chapter 2
First-Order Logic: Proofs with Quantifiers


2.1 Introduction 


This chapter introduces quantifiers and first-order logic. The first few sections 
demonstrate methods for designing proofs through preliminary versions of the 
Deduction Theorem for first-order logic, Substitutivity of Equivalences, and trans- 
formations into prenex forms. A final section derives features of predicates for 
equality and inequality, either as primitive predicate constants, or predicates defined 
from other primitive binary predicate constants. The prerequisite for this chapter
is a working knowledge of the Classical Propositional Logic, for instance, as in
chapter 1.

Pure first-order logic includes quantifiers corresponding to phrases such as “for 
each object” or “there exists an (at least one) object” with templates for functions 
of objects and relations between objects. Applied first-order logic replaces such 
templates with functions and relations specific to areas such as algebra, arithmetic, 
geometry, or set theory. 


2.2 The Pure Predicate Calculus of First Order 


In grammar, the noun “predicate” designates the verb or verbal phrase that makes a 
statement about the subject of a clause. In logic, similarly, a predicate is a part of an 
atomic formula that makes a statement about individual objects in applications. 


2.2.1 Logical Predicates 


The logical concept of predicate depends upon the theory under consideration.

2.1 Example (predicates in arithmetic). Some versions of arithmetic have only two
predicates, which state that a number is the sum or product of two numbers:
M = K + L (read “M equals the sum of K and L”),
N = K × L (read “N equals the product of K and L”),
or equivalent formulae with a different notation [18, p. 318], [72, p. 202-203].


These predicates are called “ternary” because each involves three variables. 


2.2 Example (predicates in geometry). In geometry, a predicate may state that a 
point is on a line, or that a point lies between two other points on the same line, or 
that two segments, or two angles, are congruent, or that a line lies in a plane: 


P ∈ L (read “the point P is on the line L”),
X<Y<Z (read “the point Y is between the points X and Z”),
PQ ≅ RS (read “the segment PQ is congruent to the segment RS”),
∠ABC ≅ ∠PQR (read “the angle ABC is congruent to the angle PQR”),
L ⊂ E (read “the line L lies in the plane E”),


or equivalent formulations with a different notation [61, Ch. I]. 


2.3 Example (predicates in set theory). Some versions of set theory have only one
predicate, which states that a set is an element of a set:
X ∈ Y (read “X is an element of Y”),
∅ ∈ Y (read “the empty set is an element of Y”),
X ∈ ∅ (read “X is an element of the empty set”),
∅ ∈ ∅ (read “the empty set is an element of the empty set”).
This predicate is called “binary” because it involves two variables, X and Y.


The formulae in the foregoing examples are called terms or atomic formulae
because they are the simplest formulae in arithmetic, geometry, and set theory. Thus,
X ∈ Y is a term, or, in other words, an atomic formula.

In arithmetic, the symbols 0 and 1 are called individual constants, because they
always denote the numbers zero and one, respectively. Similarly, in set theory, the
symbol ∅ is an individual constant, because it always denotes the empty set.
In contrast, the symbols = and ∈ are called predicate constants, because they
always denote the relations of equality and set membership, respectively. Logics that
include such constants are called applied predicate calculi; they may also include 
other functional constants or relational constants corresponding to other relations 
between objects. In contrast, logics that do not include any constants but allow for 
variables representing arbitrary individuals, predicates, functions, and relations are 
called pure predicate calculi. Thus a pure predicate calculus is a general logic that 
may later apply to algebra, arithmetic, geometry, and set theory as well. 

In applied logics, if an atomic formula contains a variable, then it may, but need
not, have a Truth value. For example, the formula X ∈ Y has no Truth value, because
different substitutions for X and Y can yield different Truth values. However, some
formulae may contain variables and yet have a Truth value.

2.4 Example. In logics with an “equality” relation, the formula X = X is True for
every X.


2.5 Example. In binary arithmetic the formula 0 * X = 0 is True for every X.
2.6 Example. In binary arithmetic the formula 0 * X = 1 is False for every X.
2.7 Example. In set theory the formula X ∈ ∅ is False for every X.


2.8 Example. In the theory of well-formed sets the formula X ∈ X is False for
every X.


2.2.2 Variables, Quantifiers, and Formulae 


The formulae studied here are those specified in definition 2.9. 


2.9 Definition (well-formed formulae). Select three disjoint lists of symbols. 

Every symbol from the first list of symbols, which may consist of one or more 
letter(s) from a specified alphabet, P, Q,..., optionally with subscript(s) P,, Pop, ..., 
superscript(s) P*, P™, ..., or “middlescript(s)” P|, P||, ..., is called a formulaic 
letter. Such formulaic letters are not parts of the predicate calculus, but they help in 
describing the following rules to define well-formed formulae. 

Also, every symbol from the second list of symbols, which may consist of one
or more letter(s) from a specified alphabet, A, B, ..., optionally with subscript(s)
Ay, App, ..., superscript(s) A*, A®*, ..., or “middlescript(s)” Al, Al], ..., is called a
propositional variable or a sentence symbol [31, p. 17]. (Propositional variables
may later be replaced in pure calculi by functional or relational variables, or in
applied calculi by atomic formulae, which may include individual variables or
constants specific to applications.)

Moreover, every symbol from the third list of symbols, which may consist of one
or more letter(s) from a specified alphabet, X, Y, ..., optionally with subscript(s)
X,, Xbp, ..., superscript(s) X*, X#4, ..., or “middlescript(s)” X|, X||, ..., is called an
individual variable. (Individual variables may later be replaced by items specific
to applications, for instance, numbers in arithmetic, or points in geometry.)

Every propositional variable or atomic formula is a well-formed formula. For 
all well-formed formulae P and Q, and for every individual variable X, the following 
four strings of symbols are also well-formed formulae: 


(W1) ¬(P) (read “not P”),
(W2) (P) ⇒ (Q) (read “P implies Q” or “if P, then Q”),
(W3) ∀X(P) (read “for each X, P”),
(W4) ∃X(P) (read “there exists X such that P”).

Furthermore, only strings of symbols built from letters or variables through
applications of the rules W1-W4 can be well-formed formulae. Equivalent definitions
apply to other connectives and to prefix and postfix notations.
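
Definition 2.9 is an inductive definition, so it translates directly into a recursive data type. The sketch below is one possible encoding in Python; the class names and the `render` function are assumptions of this illustration, with one class per rule W1-W4 and one for propositional variables or atomic formulae.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Atom:            # propositional variable or atomic formula, e.g. "X = Z"
    text: str

@dataclass(frozen=True)
class Not:             # rule (W1)
    body: "Formula"

@dataclass(frozen=True)
class Implies:         # rule (W2)
    left: "Formula"
    right: "Formula"

@dataclass(frozen=True)
class ForAll:          # rule (W3)
    var: str
    body: "Formula"

@dataclass(frozen=True)
class Exists:          # rule (W4)
    var: str
    body: "Formula"

Formula = Union[Atom, Not, Implies, ForAll, Exists]

def render(f):
    """Write a formula with the parentheses prescribed by W1-W4."""
    if isinstance(f, Atom):
        return f.text
    if isinstance(f, Not):
        return f"¬({render(f.body)})"
    if isinstance(f, Implies):
        return f"({render(f.left)}) ⇒ ({render(f.right)})"
    if isinstance(f, ForAll):
        return f"∀{f.var}({render(f.body)})"
    if isinstance(f, Exists):
        return f"∃{f.var}({render(f.body)})"
    raise TypeError(f"not a well-formed formula: {f!r}")

example = ForAll("X", Exists("Z", Not(Atom("X = Z"))))
print(render(example))
```

Because values can be built only through these five constructors, every value of this type corresponds to a well-formed formula, which mirrors the closing clause of the definition.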


2.2.3 Proper Substitutions of Free or Bound Variables 


In the logic presented here, only individual variables may appear immediately
after either quantifier, ∀ (read “for each”) or ∃ (read “there exists”). Because
of this restriction, this logic is a first-order logic. Logical systems allowing for
propositional variables to appear immediately after a quantifier are of second or
higher order.

In Boolean logic, if a formula P is True regardless of X, but if P also contains 
another variable Z, then substituting Z for X can change the Truth value of P. 


2.10 Counterexample. Consider any context with at least two different objects,
for instance, two binary numbers in arithmetic, two points in geometry, or two
sets in set theory. Thus for each object X there exists a different object Z,
whence ∀X{∃Z[¬(X = Z)]} is True. Replacing Z by X in ∃Z[¬(X = Z)] gives
∃X[¬(X = X)], which is False, because each object equals itself, by example 2.4.
Thus, replacing Z by X in the True formula ∀X{∃Z[¬(X = Z)]} yields the False
formula ∀X{∃X[¬(X = X)]}. Similarly, replacing X by Z in the True formula
∀X{∃Z[¬(X = Z)]} gives the False formula ∀Z{∃Z[¬(Z = Z)]}.
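
The change of Truth value can be observed on a domain with two objects. In the sketch below, Python's `all` and `any` stand in for ∀ and ∃ over a finite domain (an assumption of this illustration), and reusing a loop variable plays the rôle of the improper substitution.

```python
DOMAIN = [0, 1]   # any context with at least two different objects

# ∀X{∃Z[¬(X = Z)]}: every object differs from some object.
original = all(any(not (x == z) for z in DOMAIN) for x in DOMAIN)

# Replacing Z by X gives ∀X{∃X[¬(X = X)]}: the inner X now hides the
# outer one, and X = X always holds.
after_z_to_x = all(any(not (x == x) for x in DOMAIN) for x in DOMAIN)

# Replacing X by Z gives ∀Z{∃Z[¬(Z = Z)]}, which fails for the same reason.
after_x_to_z = all(any(not (z == z) for z in DOMAIN) for z in DOMAIN)

print(original, after_z_to_x, after_x_to_z)   # True False False
```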


One way (not pursued here) to avoid the phenomenon exhibited in counterexam- 
ple 2.10 consists of substituting parameters other than variables [117]. Alternatively, 
counterexample 2.10 shows that substitutions of a variable by another must obey 
certain rules, for instance, with the concepts introduced in definition 2.11. 


2.11 Definition (free or bound variables). For each individual variable X and for
each logical formula P, an occurrence of the variable X is bound in the formula P
if and only if in P that occurrence of the variable X immediately follows ∀ or ∃,
or if it appears in the scope of the quantifier, which is defined to be between either
∀X( or ∃X( and the corresponding right parenthesis ). An occurrence of the variable
X is free in P if and only if that occurrence of X is not bound in P. A logical formula
is closed if and only if it does not contain any free occurrence of any variable.
A logical sentence is a closed logical formula.


2.12 Example. This example focuses on the formula from counterexample 2.10. 


In the formula ∃Z[¬(X = Z)], both occurrences of the variable Z are bound.
In the formula ∃Z[¬(X = Z)], the only occurrence of the variable X is free.
The formula ∃Z[¬(X = Z)] is not closed, because it contains a free occurrence of X.


In contrast, the formula ∀X{∃Z[¬(X = Z)]} is closed, because all occurrences of X
and Z are bound.
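
The distinction between free and bound occurrences can be computed by recursion on the formula. The following sketch assumes a hypothetical tuple encoding in which each atomic formula lists the variables that occur in it; `free_vars` and `is_closed` are illustrative names, not notation from the text.

```python
# Hypothetical encoding: ("atom", text, vars), ("not", P), ("imp", P, Q),
# ("all", X, P) for ∀X(P), and ("ex", X, P) for ∃X(P).

def free_vars(f):
    """Set of variables with at least one free occurrence in f."""
    kind = f[0]
    if kind == "atom":
        return set(f[2])
    if kind == "not":
        return free_vars(f[1])
    if kind == "imp":
        return free_vars(f[1]) | free_vars(f[2])
    if kind in ("all", "ex"):
        # The quantifier binds every occurrence of its variable in its scope.
        return free_vars(f[2]) - {f[1]}
    raise ValueError(f"unknown formula kind: {kind}")

def is_closed(f):
    return not free_vars(f)

inner = ("ex", "Z", ("not", ("atom", "X = Z", ("X", "Z"))))   # ∃Z[¬(X = Z)]
outer = ("all", "X", inner)                                   # ∀X{∃Z[¬(X = Z)]}

print(free_vars(inner), is_closed(inner), is_closed(outer))
```

The results agree with example 2.12: X is free in the inner formula, which is therefore not closed, whereas the outer formula is closed.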


The following definitions avoid the phenomenon in counterexample 2.10.

2.13 Definition (change of bound variables). The substitution of the variable Z
for each bound occurrence of the variable X in P is denoted by Subb^X_Z(P). Such a
substitution is called a change of bound variables if and only if Z does not occur
in P. Such a change of bound variables is proper if and only if X does not occur
freely in P and Z does not occur in P.


2.14 Example. In the formula ∃Z[¬(X = Z)] from counterexample 2.10, the
substitution of X for the bound occurrences of Z is not a change of bound variables,
because X already occurs in the formula.


Definition 2.13 explicitly applies only to variables. In particular, it does not allow 
for substitutions of constants for bound variables. 


2.15 Remark. If Subb^X_Z(P) is a change of bound variables in a formula P, then
Subb^Z_X[Subb^X_Z(P)] reproduces P.

Indeed, if Subb^X_Z(P) is a change of bound variables, then Z does not occur
in P. Consequently, the only occurrences of Z in Subb^X_Z(P) are those replacing
bound occurrences of X. Therefore, Z does not occur freely in Subb^X_Z(P). Thus,
Subb^Z_X[Subb^X_Z(P)] replaces all the occurrences of Z in Subb^X_Z(P), all of which are
bound, by the initially bound occurrences of X, and reverts to the initial formula P.

Depending on the axiomatic system, an axiom, theorem, or inference rule may
declare that two formulae that differ from each other only by changes of bound
variables are mutually equivalent, so that (P) ⇔ [Subb^X_Z(P)] [89, p. 181].


2.16 Definition (change of free variables). A formula P admits (the substitution
of) Z for an individual variable X, or, in other words, Z is free (to be substituted) for
X in P, if and only if in the substitution of Z for every free occurrence of X


• every free occurrence of X becomes a free occurrence of Z,

or, equivalently, if X, Z, and P satisfy the following condition:


• either Z is an individual constant, or
• Z is an individual variable, and substituting Z for every free occurrence of X in P
does not convert any free occurrence of X into a bound occurrence of Z.


In the present exposition, the notation Subf^X_Z(P) states that P admits Z for X, and
substitutes Z for each free occurrence of X in P.


2.17 Example. In counterexample 2.10, the formula ∃Z[¬(X = Z)] does not admit
Z for X, or, in other words, Z is not free for X, because substituting Z for every free
occurrence of X converts the free occurrence of X into a bound occurrence of Z in
∃Z[¬(Z = Z)].
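
The condition in definition 2.16 can be tested by tracking which variables are bound on the way down to each atomic formula. The sketch below covers only the case where Z is an individual variable, and it assumes a hypothetical tuple encoding of formulae in which each atomic formula lists its variables; the function name `admits` is likewise an assumption.

```python
# Hypothetical encoding: ("atom", text, vars), ("not", P), ("imp", P, Q),
# ("all", X, P) for ∀X(P), and ("ex", X, P) for ∃X(P).

def admits(f, x, z, bound=frozenset()):
    """True if substituting the variable z for every free occurrence of x
    in f turns each of them into a free occurrence of z."""
    kind = f[0]
    if kind == "atom":
        x_is_free_here = x in f[2] and x not in bound
        # A free x under a quantifier on z would become a bound z.
        return not (x_is_free_here and z in bound)
    if kind == "not":
        return admits(f[1], x, z, bound)
    if kind == "imp":
        return admits(f[1], x, z, bound) and admits(f[2], x, z, bound)
    if kind in ("all", "ex"):
        return admits(f[2], x, z, bound | {f[1]})
    raise ValueError(f"unknown formula kind: {kind}")

inner = ("ex", "Z", ("not", ("atom", "X = Z", ("X", "Z"))))   # ∃Z[¬(X = Z)]

print(admits(inner, "X", "Z"))   # Z would be captured by ∃Z
print(admits(inner, "X", "Y"))   # Y does not occur in the formula
```

The first call reproduces example 2.17; the second illustrates that a variable absent from the formula is always admitted.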


Thus another way to avoid the phenomenon exhibited in counterexample 2.10
consists in allowing only substitutions admitted in the sense of definition 2.16 [72,
p. 94], [84, p. 48], [108, p. 37], [110, p. 101].

2.18 Remark. If an individual variable Z does not occur in a formula P, then P
admits Z for each individual variable X. Indeed, Z does not appear in P after any
quantifier (∀ or ∃); thus Z is not bound in P, so that every free occurrence of X is
replaced by a free occurrence of Z in Subf^X_Z(P).


2.19 Remark. If an individual variable Z does not occur in a formula P, and
if P does not contain any bound occurrences of an individual variable X, then
Subf^Z_X[Subf^X_Z(P)] is P. Indeed, by hypothesis all occurrences of X are free in P, and
by remark 2.18 they all become free occurrences of Z in Subf^X_Z(P). Also, X does not
occur in Subf^X_Z(P). Again by remark 2.18 but with X and Z swapped, all occurrences
of Z in Subf^X_Z(P) become free occurrences of X in Subf^Z_X[Subf^X_Z(P)], which no longer
contains any occurrences of Z. Thus Subf^Z_X[Subf^X_Z(P)] is P.


2.20 Remark. All well-formed formulae P and Q result from the construction 
specified in definition 2.9, so that individual variables occur only immediately after 
quantifiers, or in terms (atomic formulae) that are then combined with connectives. 
Consequently, the following pairs of formulae are not only mutually equivalent but 
also mutually identical [117, p. 44]: 


• Subf^X_Z[¬(P)] and ¬[Subf^X_Z(P)].
• Subf^X_Z[(P) ⇒ (Q)] and [Subf^X_Z(P)] ⇒ [Subf^X_Z(Q)].
• Subb^X_Z[¬(P)] and ¬[Subb^X_Z(P)].
• Subb^X_Z[(P) ⇒ (Q)] and [Subb^X_Z(P)] ⇒ [Subb^X_Z(Q)].


The following abbreviation is convenient. 
2.21 Definition (abbreviation). The notation ∃!X(P) (read “there exists a unique
X such that P” or “there exists exactly one X such that P”) abbreviates the formula


∃X[(P) ∧ (∀Z{[Subf^X_Z(P)] ⇒ (Z = X)})].


2.2.4 Axioms and Rules for the Pure Predicate Calculus 


As the axioms of the propositional calculus reflect patterns of deductive reasoning 
with implications and negations, the axioms of the predicate calculus reflect 
patterns of deductive reasoning with quantifiers. There also exist several choices 
of initial axioms for use with quantifiers, for instance, the following axioms 
[18, §30, p. 171-172], [122, p. 170]. 


2.22 Definition. The following axioms of the Pure Predicate Calculus govern the
use of the universal quantifier ∀ and the existential quantifier ∃.


Axiom Q0 Axioms of the Propositional Calculus, but here with well-formed
formulae as in definition 2.9, are axioms of the predicate calculus.


Axiom Q1 (specialization) [∀X(P)] ⇒ [Subf^X_Z(P)], if P admits Z for X.
Axiom Q2 {∀X[(P) ⇒ (Q)]} ⇒ {(P) ⇒ [∀X(Q)]}, if P contains no free X.
Axiom Q3 {∀X[¬(P)]} ⇔ {¬[∃X(P)]}.
Axiom Q4 {∃X[¬(P)]} ⇔ {¬[∀X(P)]}.


The first axiom (schema) of the predicate calculus (Q0) and the rules of inference
carry all the theorems from the propositional calculus over to theorems of the
predicate calculus. In particular, different propositional calculi, which may result
from different axiom systems, may lead to different predicate calculi.


2.23 Example. From definition 1.4, the following formulae (P1) and (P2) form a
system of two axioms for the Pure Positive Implicational Propositional Calculus:
(P1) (P) ⇒ [(Q) ⇒ (P)],
(P2) {(P) ⇒ [(Q) ⇒ (R)]} ⇒ {[(P) ⇒ (Q)] ⇒ [(P) ⇒ (R)]}.


Formulae (P1), (P2), and


(P3) {[¬(Q)] ⇒ [¬(P)]} ⇒ [(P) ⇒ (Q)]


form a system of three axioms for the Pure Classical Propositional Calculus.


The second axiom (schema) (Q1) corresponds to the notion that if an individual
variable X may occur in a formula P, and if P is True regardless of X, in other words,
if ∀X(P) is True, then P remains True with X replaced by any individual variable or
constant Z. If X and Z are the same variable, then axiom Q1 gives [∀X(P)] ⇒ (P).

The third axiom (schema) (Q2) describes the relation between the universal 
quantifier (“for each”) and the logical connective of the Pure Positive Implicational 
Propositional Calculus (“if ... then’). 

The fourth axiom, for the existential quantifier (Q3), states that a formula P is 
False for every X if and only if there does not exist any X for which P is True. 

Similarly, the fifth axiom (schema), for the existential quantifier (Q4), states that 
there exists some X for which P is False if and only if it is False that P is True 
for every X. Axiom Q4 asserts the existence of an object. Consequently, axiom Q4 
applies neither to “empty” theories where nothing exists, nor to logics that require 
not only existence but also the determination of which objects satisfy a formula. 

Besides the axioms, the predicate calculus allows for proofs of theorems through 
the following rules of inference. 


2.24 Definition (rules of inference). The following rules of inference hold. 


2.25 Rule (“Modus Ponens” (abbreviated by M. P.), or “Detachment”).


If P is a theorem, and
if $(P) \Rightarrow (Q)$ is a theorem,
then Q is a theorem.


2.26 Rule (Generalization).


If P is a theorem,
then $\forall X(P)$ is also a theorem.


2.27 Definition (theorems and proofs). A proof is a sequence of well-formed 
formulae H, K, L, ... P, Q, R, where each formula is either (a substitution in) an 
axiom (schema), or results from a previous formula in the sequence by any rule of 
inference (Detachment, Generalization, or Substitution). 

A formula is a theorem if and only if it is a (usually the last) formula in a proof. 
The notation $\vdash R$ means that R is a theorem.
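
As an illustration of definition 2.27, the following minimal Python sketch checks a sequence of formulae in which every step is either a listed axiom instance or follows from earlier steps by Detachment. The tuple encoding of implications and the names `imp`, `follows_by_detachment`, and `is_proof` are assumptions made for this sketch only; Generalization and Substitution are deliberately omitted, and the axiom instances are supplied explicitly rather than recognized from schemata.

```python
def imp(p, q):
    """Encode the implication (p) => (q) as a tuple (hypothetical encoding)."""
    return ("imp", p, q)


def follows_by_detachment(formula, earlier):
    """True if some earlier P and earlier (P) => (formula) both occur."""
    return any(imp(p, formula) in earlier for p in earlier)


def is_proof(steps, axiom_instances):
    """Check that each step is a listed axiom instance or results from Detachment."""
    earlier = []
    for formula in steps:
        if formula not in axiom_instances and not follows_by_detachment(formula, earlier):
            return False
        earlier.append(formula)
    return True


# A five-step proof of (P) => (P) from instances of axioms P1 and P2.
P = "P"
PP = imp(P, P)
p1_a = imp(P, imp(PP, P))                          # P1 with Q replaced by (P) => (P)
p1_b = imp(P, PP)                                  # P1 with Q replaced by P
p2 = imp(p1_a, imp(p1_b, PP))                      # P2 with Q: (P) => (P), R: P
axioms = {p1_a, p1_b, p2}
steps = [p2, p1_a, imp(p1_b, PP), p1_b, PP]
```

With these data, `is_proof(steps, axioms)` accepts the sequence, whereas the one-step sequence `[PP]` is rejected, because (P) => (P) is neither a listed axiom instance nor the result of a Detachment.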


2.28 Example. Every axiom of the predicate calculus is a theorem. 


2.2.5 Exercises on Quantifiers 


Each of the following ten exercises lists one formula P. Identify a formula that is 
logically equivalent to —(P) among the same ten exercises. 


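2.1. $\forall X[\exists Y(X \in Y)]$

2.2. $\forall X[\exists Y(Y \in X)]$

2.3. $\forall X[(X \in A) \Rightarrow (X \in B)]$

2.4. $\forall X\{(X \in C) \Leftrightarrow [(X \in A) \wedge (X \in B)]\}$

2.5. $\forall X\{(X \in C) \Leftrightarrow [(X \in A) \vee (X \in B)]\}$

2.6. $\exists X(\{(X \in C) \wedge [\neg(X \in A)] \wedge [\neg(X \in B)]\} \vee \{[\neg(X \in C)] \wedge [(X \in A) \vee (X \in B)]\})$

2.7. $\exists X\{(X \in A) \wedge [\neg(X \in B)]\}$

2.8. $\exists X\{\forall Y[\neg(Y \in X)]\}$

2.9. $\exists X([(X \in C) \wedge \{[\neg(X \in A)] \vee [\neg(X \in B)]\}] \vee \{[\neg(X \in C)] \wedge [(X \in A) \wedge (X \in B)]\})$

2.10. $\exists X\{\forall Y[\neg(X \in Y)]\}$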


2.2.6 Examples with Implicational and Predicate Calculi 


The examples of theorems and proofs selected for this and the subsequent subsec- 
tions gradually build up a tool to design proofs by substituting mutually equivalent 
formulae for one another. As a first step, the following derived rules of inference 
will simplify proofs by avoiding potentially lengthy instances of axiom Q2.

2.29 Theorem (derived rule). If X is not free in R, and if $\forall X[(R) \Rightarrow (S)]$ is a
theorem, then $(R) \Rightarrow [\forall X(S)]$ is also a theorem.


Proof. Apply axiom Q2 and Detachment:


$\vdash \forall X[(R) \Rightarrow (S)]$    hypothesis,
$\vdash \{\forall X[(R) \Rightarrow (S)]\} \Rightarrow \{(R) \Rightarrow [\forall X(S)]\}$    axiom Q2, no free X in R,
$\vdash (R) \Rightarrow [\forall X(S)]$    Detachment.
□


2.30 Theorem (derived rule). If X does not occur freely in R, and if $(R) \Rightarrow (S)$ is
a theorem, then $(R) \Rightarrow [\forall X(S)]$ is also a theorem.


Proof. Apply Generalization and theorem 2.29:

$\vdash (R) \Rightarrow (S)$    hypothesis,
$\vdash \forall X[(R) \Rightarrow (S)]$    Generalization,
$\vdash (R) \Rightarrow [\forall X(S)]$    theorem 2.29, no free X in R.
□


Theorem 2.31 reveals a situation where $\forall Y[\mathrm{Subf}^X_Y(U)]$ may be replaced by
$\forall X[\mathrm{Subf}^X_X(U)]$, which is $\forall X(U)$.


2.31 Theorem (change of bound variables). If Y does not occur in U, then
$\vdash [\forall X(U)] \Rightarrow \{\forall Y[\mathrm{Subf}^X_Y(U)]\}$.

If Y does not occur in U, and if U does not contain any bound occurrence of X,
then conversely $\vdash [\forall X(U)] \Leftarrow \{\forall Y[\mathrm{Subf}^X_Y(U)]\}$.


Proof. This proof follows Monk’s [89, p. 180, thm. 10.55]. Let R denote $\forall X(U)$
and let S denote $\mathrm{Subf}^X_Y(U)$, so that the two steps have the forms $(R) \Rightarrow (S)$
and $(R) \Rightarrow [\forall Y(S)]$:


$\vdash [\forall X(U)] \Rightarrow [\mathrm{Subf}^X_Y(U)]$    specialization (axiom Q1), no Y in U,
$\vdash [\forall X(U)] \Rightarrow \{\forall Y[\mathrm{Subf}^X_Y(U)]\}$    theorem 2.29, no Y in U, so no Y in R.

For the converse, if Y does not occur in U, and if U does not contain any bound
occurrence of X, then each occurrence of X in U is replaced by a free occurrence of
Y in $\mathrm{Subf}^X_Y(U)$. Consequently, if V denotes the formula $\mathrm{Subf}^X_Y(U)$, then V contains
no occurrences of X and no bound occurrences of Y. Moreover, $\mathrm{Subf}^Y_X[\mathrm{Subf}^X_Y(U)]$
reproduces U, by remark 2.19. Thus, applying the previous result to the formula V
and swapping the roles of X and Y give $\vdash [\forall Y(V)] \Rightarrow \{\forall X[\mathrm{Subf}^Y_X(V)]\}$. □


2.32 Remark (change of bound variables). Theorem 2.31 shows that two formulae
P and Q that differ from each other only by the names of their bound variables are
mutually equivalent. Indeed, if the variable Y does not occur in a formula P, and if U
is any atomic formula (for instance, $X \in Z$) that occurs in P, then U does not contain
any bound variables; in particular, U does not contain any bound occurrence of X.
Consequently $\vdash [\forall X(U)] \Leftrightarrow \{\forall Y[\mathrm{Subf}^X_Y(U)]\}$ by theorem 2.31. This substitution
may use different new variables $Y$, $Y_{|}$, $Y_{||}$, ..., for different atomic formulae $U$, $U_{|}$,
$U_{||}$, ... and then proceed to more complicated components of P. Using the same
new variables in P and Q results in two identical formulae.

2.33 Example. Let P denote the formula $\forall X\{\exists Z[\neg(X = Z)]\}$, and let Q denote the
formula $\forall W\{\exists Y[\neg(W = Y)]\}$. The variables $Y_{|}$ and $Y_{||}$ occur in neither P nor Q.

In P, let $U_{|}$ denote the atomic formula $X = Z$. Then $\vdash [\exists Z(U_{|})] \Leftrightarrow
\{\exists Y_{|}[\mathrm{Subf}^Z_{Y_{|}}(U_{|})]\}$ by theorem 2.31. Let $U_{||}$ denote the resulting formula $\exists Y_{|}[\neg(X =
Y_{|})]$. Then theorem 2.31 shows that $\vdash [\forall X(U_{||})] \Leftrightarrow \{\forall Y_{||}[\mathrm{Subf}^X_{Y_{||}}(U_{||})]\}$, which is
$\forall Y_{||}\{\exists Y_{|}[\neg(Y_{||} = Y_{|})]\}$. The same formula results from the same procedure applied
to Q.

The following selection of theorems also relates the present axioms to other 
axiom systems in subsection 2.2.8. Their proofs follow Church’s [18, p. 186-188]. 


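2.34 Theorem. For all P, Q, and X, $\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow (Q)\}$.


Proof. Apply the Implicational Calculus with axiom Q1:

$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow [(P) \Rightarrow (Q)]$    axiom Q1,
$\vdash [\forall X(P)] \Rightarrow (P)$    axiom Q1,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow (Q)\}$    derived rule (theorem 1.31).
□


2.35 Theorem. For all P, Q, and X, $\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow
[\forall X(Q)]\}$.


Proof. Apply the Implicational Calculus with Generalization, axiom Q2, and
theorems 2.30 and 2.34. In the first two steps, R denotes $\forall X[(P) \Rightarrow (Q)]$ and S
denotes $[\forall X(P)] \Rightarrow (Q)$, so that the steps have the forms $(R) \Rightarrow (S)$ and
$(R) \Rightarrow [\forall X(S)]$:


$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow (Q)\}$    theorem 2.34,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow (\forall X\{[\forall X(P)] \Rightarrow (Q)\})$    theorem 2.30,
$\vdash (\forall X\{[\forall X(P)] \Rightarrow (Q)\}) \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$    axiom Q2,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$    transitivity (1.16).
□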


2.36 Counterexample. The converse of theorem 2.35, which would be

$\{[\forall X(P)] \Rightarrow [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$,

is False in contexts with two different objects Y and Z, so that $\neg(Y = Z)$ is True:


• $\forall X[(X = Y) \Rightarrow (X = Z)]$ is False, because substituting Y for X gives $[(Y =
Y) \Rightarrow (Y = Z)]$, which is False, because of the True hypothesis $Y = Y$ and the
False conclusion $Y = Z$.

• $\forall X(X = Y)$ is False, because substituting Z for X gives $(Z = Y)$, which is False
by the assumption that $\neg(Y = Z)$.

• $[\forall X(X = Y)] \Rightarrow [\forall X(X = Z)]$ is True, because of its False hypothesis.

• $\{[\forall X(X = Y)] \Rightarrow [\forall X(X = Z)]\} \Rightarrow \{\forall X[(X = Y) \Rightarrow (X = Z)]\}$ is False,
because of the True hypothesis and the False conclusion.

2.37 Theorem (derived rule). For all P, Q, and X, if $\vdash (P) \Rightarrow (Q)$, then
$\vdash [\forall X(P)] \Rightarrow (Q)$ and $\vdash [\forall X(P)] \Rightarrow [\forall X(Q)]$.


Proof. Apply theorems 2.34 and 2.35:

$\vdash (P) \Rightarrow (Q)$    hypothesis,
$\vdash \forall X[(P) \Rightarrow (Q)]$    Generalization,
$\vdash [\forall X(P)] \Rightarrow (Q)$    theorem 2.34 and Detachment,
$\vdash [\forall X(P)] \Rightarrow [\forall X(Q)]$    theorem 2.35 and Detachment.
□


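2.38 Theorem. For all P, Q, and X, if P does not contain any free occurrence of X,
then $\vdash \{(P) \Rightarrow [\forall X(Q)]\} \Leftrightarrow \{\forall X[(P) \Rightarrow (Q)]\}$.


Proof. Axiom Q2 gives $\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{(P) \Rightarrow [\forall X(Q)]\}$. For the
converse, use the Pure Positive Implicational Propositional Calculus with axiom Q1
and Generalization. In the last two steps, R denotes $(P) \Rightarrow [\forall X(Q)]$ and S denotes
$(P) \Rightarrow (Q)$:


$\vdash [\forall X(Q)] \Rightarrow (Q)$    axiom Q1,
$\vdash \{(P) \Rightarrow [\forall X(Q)]\} \Rightarrow \{(P) \Rightarrow [\forall X(Q)]\}$    theorem 1.14,
$\vdash \{(P) \Rightarrow [\forall X(Q)]\} \Rightarrow [(P) \Rightarrow (Q)]$    theorem 1.32,
$\vdash \forall X(\{(P) \Rightarrow [\forall X(Q)]\} \Rightarrow [(P) \Rightarrow (Q)])$    Generalization,
$\vdash \{(P) \Rightarrow [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$    theorem 2.29, no free X in $(P) \Rightarrow [\forall X(Q)]$.
□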


2.39 Theorem. For all P and X, if P does not contain any free occurrence of X,
then $\vdash [\forall X(P)] \Rightarrow (P)$ and $\vdash [\forall X(P)] \Leftarrow (P)$.


Proof. Axiom Q1 gives $\vdash [\forall X(P)] \Rightarrow (P)$. For the converse, apply theorems 1.14
and 2.30:

$\vdash (P) \Rightarrow (P)$    theorem 1.14,
$\vdash (P) \Rightarrow [\forall X(P)]$    theorem 2.30.
□


2.40 Remark. The statement of theorem 2.39 suggests that if P contains a free
occurrence of X, then the implication $(P) \Rightarrow [\forall X(P)]$ may differ from the
Generalization rule, from $\vdash P$ to infer $\vdash \forall X(P)$, which applies only if P is a
theorem.


2.41 Example. If P denotes the formula $X = \varnothing$, then $(P) \Rightarrow [\forall X(P)]$ becomes
$(X = \varnothing) \Rightarrow [\forall X(X = \varnothing)]$, which is not a theorem. Indeed, if $(X = \varnothing) \Rightarrow
[\forall X(X = \varnothing)]$ were a theorem, then substituting $\varnothing$ for the free occurrences of X by
specialization and Detachment would yield $(\varnothing = \varnothing) \Rightarrow [\forall X(X = \varnothing)]$, which is
not a theorem, because $\varnothing = \varnothing$ is True while $\forall X(X = \varnothing)$ is False in set theory.
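
The semantic arguments in counterexample 2.36 and example 2.41 can be replayed on small finite domains. The following Python sketch is only an illustration under stated assumptions: the helpers `forall` and `implies`, the two-element domain with the labels "Y" and "Z", and the two-element collection of sets are all choices made for this sketch, not part of the text.

```python
def forall(domain, predicate):
    """Truth value of a universally quantified formula over a finite domain."""
    return all(predicate(x) for x in domain)


def implies(a, b):
    """Truth value of the implication (a) => (b)."""
    return (not a) or b


# Counterexample 2.36: two different objects Y and Z, with P: X = Y and Q: X = Z.
y, z = "Y", "Z"
domain = [y, z]
hypothesis = implies(forall(domain, lambda x: x == y),
                     forall(domain, lambda x: x == z))
conclusion = forall(domain, lambda x: implies(x == y, x == z))
converse_of_2_35 = implies(hypothesis, conclusion)

# Example 2.41: the empty set and one nonempty set, with P: X = empty set.
empty = frozenset()
sets = [empty, frozenset({empty})]
instance_of_2_41 = implies(empty == empty,
                           forall(sets, lambda x: x == empty))
```

Here `hypothesis` evaluates to True and `conclusion` to False, so `converse_of_2_35` is False; similarly `instance_of_2_41` is False. Such finite checks only exhibit interpretations in which a formula fails; they do not replace the formal derivations.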


Theorem 2.42 provides a converse for theorem 2.35 if X is not free in P.


2.42 Theorem. For all P, Q, and X, if X does not occur freely in P, then
$\vdash \{[\forall X(P)] \Rightarrow [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$.

Proof. Use theorems 2.38 and 2.39, with H, K, L, M as in theorem 1.36, where
H denotes P, K denotes $\forall X(P)$, L denotes $\forall X(Q)$, and M denotes $\forall X[(P) \Rightarrow (Q)]$:

$\vdash \{(P) \Rightarrow [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$    theorem 2.38, no free X in P,
$\vdash (P) \Rightarrow [\forall X(P)]$    theorem 2.39, no free X in P,
$\vdash \{[\forall X(P)] \Rightarrow [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$    theorem 1.36.
□

2.2.7 Examples with Pure Propositional and Predicate Calculi 


The following theorems invoke the full Classical Propositional Calculus, including 
contraposition and its converse for negations, or Tarski’s axioms for equivalences. 


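2.43 Theorem. For all P, Q, and X,
$\vdash \{\forall X[(P) \Leftrightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Leftrightarrow [\forall X(Q)]\}$.

Proof. Apply theorems 2.37 and 2.35 with the transitivity of implication:

$\vdash [(P) \Leftrightarrow (Q)] \Rightarrow [(P) \Rightarrow (Q)]$    definition of $\Leftrightarrow$,
$\vdash \{\forall X[(P) \Leftrightarrow (Q)]\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$    theorem 2.37,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$    theorem 2.35,
$\vdash \{\forall X[(P) \Leftrightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$    transitivity.

The converse conclusion results from the symmetry of $\Leftrightarrow$ and swapping P and Q.
The final result then follows from theorem 1.55. □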


For the record, theorem 2.44 combines theorems 2.35, 2.42, and 2.43.


2.44 Theorem. For all P, Q, and X, if X does not occur freely in P, then
$\vdash \{[\forall X(P)] \Rightarrow [\forall X(Q)]\} \Leftrightarrow \{\forall X[(P) \Rightarrow (Q)]\}$.


Proof. Apply theorems 2.35, 2.42, and 2.43. □


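2.45 Theorem (derived rule). For all P, Q, and X, if $\vdash (P) \Leftrightarrow (Q)$, then
$\vdash [\forall X(P)] \Leftrightarrow [\forall X(Q)]$.


Proof. Apply Generalization, theorem 2.43, and Detachment:


$\vdash (P) \Leftrightarrow (Q)$    hypothesis,
$\vdash \forall X[(P) \Leftrightarrow (Q)]$    Generalization,
$\vdash \{\forall X[(P) \Leftrightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Leftrightarrow [\forall X(Q)]\}$    theorem 2.43,
$\vdash [\forall X(P)] \Leftrightarrow [\forall X(Q)]$    Detachment.
□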


Theorems 2.46 and 2.47 show that $\exists$ could be defined in terms of $\forall$ and
double negation, or vice versa, provided that axiom Q0 includes the full Classical
Propositional Calculus.

2.46 Theorem. For all P and X, $\vdash [\exists X(P)] \Leftrightarrow (\neg\{\forall X[\neg(P)]\})$.

Proof. Apply the full propositional calculus and axiom Q3:


$\vdash \{\neg[\exists X(P)]\} \Leftrightarrow \{\forall X[\neg(P)]\}$    axiom Q3,
$\vdash (\neg\{\neg[\exists X(P)]\}) \Leftrightarrow (\neg\{\forall X[\neg(P)]\})$    contraposition and its converse,
$\vdash [\exists X(P)] \Leftrightarrow (\neg\{\forall X[\neg(P)]\})$    double negation and transitivity.
□


2.47 Theorem. For all P and X, $\vdash [\forall X(P)] \Leftrightarrow (\neg\{\exists X[\neg(P)]\})$.

Proof. Apply the full propositional calculus and axiom Q4:


$\vdash \{\neg[\forall X(P)]\} \Leftrightarrow \{\exists X[\neg(P)]\}$    axiom Q4,
$\vdash (\neg\{\neg[\forall X(P)]\}) \Leftrightarrow (\neg\{\exists X[\neg(P)]\})$    contraposition and its converse,
$\vdash [\forall X(P)] \Leftrightarrow (\neg\{\exists X[\neg(P)]\})$    double negation and transitivity.
□


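2.48 Theorem (existential generalization). For all X, Y, and P, $\vdash [\mathrm{Subf}^X_Y(P)] \Rightarrow
[\exists X(P)]$. In particular, $\vdash (P) \Rightarrow [\exists X(P)]$.


Proof. Apply the propositional calculus with axioms Q1 and Q3:

$\vdash \{\forall X[\neg(P)]\} \Rightarrow \{\mathrm{Subf}^X_Y[\neg(P)]\}$    axiom Q1,
$\vdash \{\neg[\exists X(P)]\} \Rightarrow \{\forall X[\neg(P)]\}$    axiom Q3,
$\vdash \{\mathrm{Subf}^X_Y[\neg(P)]\} \Rightarrow \{\neg[\mathrm{Subf}^X_Y(P)]\}$    remark 2.20,
$\vdash \{\neg[\exists X(P)]\} \Rightarrow \{\neg[\mathrm{Subf}^X_Y(P)]\}$    transitivity,
$\vdash [\mathrm{Subf}^X_Y(P)] \Rightarrow [\exists X(P)]$    converse contraposition & Detachment.
□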


Theorem 2.49 provides a converse to theorem 2.48 if X is not free in P.

2.49 Theorem. For all P and X, if X is not free in P, then $\vdash [\exists X(P)] \Rightarrow (P)$.


Proof. Apply the propositional calculus with theorems 2.46 and 2.39:

$\vdash [\neg(P)] \Rightarrow \{\forall X[\neg(P)]\}$    theorem 2.39, no free X in P,
$\vdash (\neg\{\forall X[\neg(P)]\}) \Rightarrow \{\neg[\neg(P)]\}$    contraposition,
$\vdash [\exists X(P)] \Rightarrow (\neg\{\forall X[\neg(P)]\})$    theorem 2.46,
$\vdash [\exists X(P)] \Rightarrow \{\neg[\neg(P)]\}$    transitivity,
$\vdash \{\neg[\neg(P)]\} \Rightarrow (P)$    double negation,
$\vdash [\exists X(P)] \Rightarrow (P)$    transitivity.

The converse is theorem 2.48. □


2.2.8 Other Axiomatic Systems for the Pure Predicate Calculus 


With the rules of Detachment and Generalization, and equivalent propositional 
calculi, Margaris [84, p. 49] and Rosser [110, p. 101] use the following axiom 
schemata for the predicate calculus: 


Axiom A4 $\{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$.
Axiom A5 $[\forall X(P)] \Rightarrow [\mathrm{Subf}^X_Y(P)]$.
Axiom A6 $(P) \Rightarrow [\forall X(P)]$ if X does not occur freely in P.


Margaris allows Generalizations only of axioms but proves a deduction theorem 
that then leads to the same rule of Generalization [84, p. 49]. 
In contrast, Kleene [72, p. 107] uses two axiom schemata 


$\forall$-schema $[\forall X(P)] \Rightarrow [\mathrm{Subf}^X_Y(P)]$,
$\exists$-schema $[\mathrm{Subf}^X_Y(P)] \Rightarrow [\exists X(P)]$,


paired with two inference rules, each with a restriction on free occurrences of X:


$\forall$-rule from $(P) \Rightarrow (Q)$ infer $(P) \Rightarrow [\forall X(Q)]$, where X is not free in P,
$\exists$-rule from $(P) \Rightarrow (Q)$ infer $[\exists X(P)] \Rightarrow (Q)$, where X is not free in Q.


In both systems, $\exists X(P)$ is merely an abbreviation for $\neg\{\forall X[\neg(P)]\}$, as also in
the systems of Church [18, p. 171] and Stoll [122, p. 115]. Conversely, in other
systems $\forall X(P)$ is merely an abbreviation for $\neg\{\exists X[\neg(P)]\}$, for instance, in Kunen’s
[74, p. 3]. Axioms such as Q3 and Q4 partially dissociate the quantifiers from the
axiom(s) for the negation in the selected propositional calculus.

Also, yet another way to define substitutions of free variables consists in 
performing substitutions with a different procedure, as outlined in definition 2.50. 


2.50 Definition (proper substitution of free variables). A proper substitution 
of a variable Z for each free occurrence of a different variable X in a logical formula 
P consists of the following three steps: 


(1) Identify a variable that does not occur in P, for example, Y. 
(2) In P, replace each bound occurrence of Z by Y. 
(3) Then replace each free occurrence of X by Z. 


Steps (1) and (2) produce a change of bound variables according to definition 2.13. 
After step (2), P no longer contains any bound occurrence of Z, and hence no strings
of the form $\forall Z$ or $\exists Z$. Consequently, P now admits Z for X in step (3).
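
The three steps of definition 2.50 can be sketched in Python on formulae encoded as nested tuples. The encoding (tags such as "forall", "exists", "not", "or", "eq", "in"), the function names, and the label "EMPTY" for the constant denoting the empty set are assumptions of this sketch; for simplicity the helper `variables` also collects constant names, which is harmless for the check in step (1).

```python
QUANTIFIERS = ("forall", "exists")
ATOMS = ("eq", "in")


def variables(f):
    """All names occurring in f, free or bound (constants included)."""
    if f[0] in ATOMS:
        return {f[1], f[2]}
    if f[0] in QUANTIFIERS:
        return {f[1]} | variables(f[2])
    return set().union(*(variables(g) for g in f[1:]))


def rename_bound(f, old, new, bound=False):
    """Step (2): replace each bound occurrence of `old` by `new`."""
    if f[0] in ATOMS:
        swap = lambda t: new if (bound and t == old) else t
        return (f[0], swap(f[1]), swap(f[2]))
    if f[0] in QUANTIFIERS:
        if f[1] == old:
            return (f[0], new, rename_bound(f[2], old, new, True))
        return (f[0], f[1], rename_bound(f[2], old, new, bound))
    return (f[0],) + tuple(rename_bound(g, old, new, bound) for g in f[1:])


def replace_free(f, x, z):
    """Step (3): replace each free occurrence of x by z."""
    if f[0] in ATOMS:
        swap = lambda t: z if t == x else t
        return (f[0], swap(f[1]), swap(f[2]))
    if f[0] in QUANTIFIERS:
        if f[1] == x:
            return f  # every occurrence of x below this quantifier is bound
        return (f[0], f[1], replace_free(f[2], x, z))
    return (f[0],) + tuple(replace_free(g, x, z) for g in f[1:])


def proper_substitution(f, x, z, fresh):
    """Proper substitution of z for x in f, using the unused variable `fresh`."""
    if fresh in variables(f):  # step (1)
        raise ValueError("the auxiliary variable must not occur in the formula")
    return replace_free(rename_bound(f, z, fresh), x, z)


# The formula (forall X {exists Z [not (X = Z)]}) or (X = EMPTY).
P = ("or",
     ("forall", "X", ("exists", "Z", ("not", ("eq", "X", "Z")))),
     ("eq", "X", "EMPTY"))
result = proper_substitution(P, "X", "Z", "Y")
```

For this formula, `result` encodes (forall X {exists Y [not (X = Y)]}) or (Z = EMPTY): the bound Z has been renamed to Y, and only the free occurrence of X has been replaced by Z.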


Definitions 2.16 and 2.50 thus provide two ways to avoid the phenomenon in 
counterexample 2.10. Both ways lend themselves to the same notation. 


2.51 Definition. The notation $\mathrm{Subf}^X_Z(P)$ states that P admits Z for X, and substitutes
Z for each free occurrence of X in P. Alternatively, the same notation $\mathrm{Subf}^X_Z(P)$
denotes the proper substitution of Z for each free occurrence of X in P. The two
alternatives are compatible, by remark 2.15 and definition 2.50. For convenience,
$\mathrm{Subf}^X_X(P)$ is defined to be P, and if X does not occur freely in P, then $\mathrm{Subf}^X_Z(P)$ is
also defined to be P. The concept and notation for proper substitutions also apply to
the substitution of a constant for a free variable. Because constants cannot appear
immediately after a quantifier, they are not bound. Consequently, only the last step
applies to the proper substitution of constants for free variables. Thus, $\mathrm{Subf}^X_{\varnothing}(P)$
merely substitutes $\varnothing$ for every free occurrence of X in P.


2.52 Example. For P consider the formula $(\forall X\{\exists Z[\neg(X = Z)]\}) \vee (X = \varnothing)$.

(1) Verify that the variable Y does not occur in P.

(2) Replace the bound occurrences of Z by Y, which gives the formula
$(\forall X\{\exists Y[\neg(X = Y)]\}) \vee (X = \varnothing)$.

(3) Replace the free occurrence of X by Z, which gives the formula
$(\forall X\{\exists Y[\neg(X = Y)]\}) \vee (Z = \varnothing)$ for $\mathrm{Subf}^X_Z(P)$.


In contrast, substituting the constant $\varnothing$ for every free occurrence of X in P yields
$(\forall X\{\exists Z[\neg(X = Z)]\}) \vee (\varnothing = \varnothing)$ for $\mathrm{Subf}^X_{\varnothing}(P)$.


2.53 Example. A situation like that in definition 2.50 occurs with computer algo- 
rithms to swap two variables X and Z, which typically use a third variable Y distinct 
from X and Z as a temporary storage. First, the algorithm assigns X to Y, an 
operation denoted by Y := X. Second, the algorithm assigns Z to X, an operation 
denoted by X := Z. Finally, the algorithm assigns Y to Z, an operation denoted by 
Z:=Y. 
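
The three assignments just described can be written, for instance, as the following short Python sketch, in which the function name `swap_with_temporary` is chosen only for illustration.

```python
def swap_with_temporary(x, z):
    """Swap two values through a temporary variable y distinct from x and z."""
    y = x  # Y := X
    x = z  # X := Z
    z = y  # Z := Y
    return x, z
```

For example, `swap_with_temporary(1, 2)` returns the pair `(2, 1)`.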


2.2.9 Exercises on Kleene’s, Margaris’s, and Rosser’s Axioms 


The following exercises show that Margaris’s and Rosser’s axioms A4-A6 are
derivable from the rules of inference with axioms Q1-Q4 and the Classical
Propositional Calculus.


2.11. Prove that the abbreviation $\exists X(P)$ for $\neg\{\forall X[\neg(P)]\}$ is derivable from the
rules of inference with axioms Q1-Q4 and the Classical Propositional Calculus.


2.12. Prove that Margaris’s and Rosser’s axioms A4, A5, and A6 are theorems
derivable from the rules of inference with axioms Q1-Q4 and the Classical
Propositional Calculus.


The following exercises show that axioms Q1-Q4 are derivable from Margaris’s
and Rosser’s axioms A4-A6 and the Classical Propositional Calculus.


2.13. Prove that axiom Q2 is a theorem derivable from the rules of inference with
Margaris’s and Rosser’s axioms A4-A6 and the Classical Propositional Calculus.


2.14. Prove that axiom Q1 is a theorem derivable from the rules of inference with
Margaris’s and Rosser’s axioms A4-A6 and the Classical Propositional Calculus.


2.15. Prove that axiom Q4 is a theorem derivable from the rules of inference with
Margaris’s and Rosser’s axioms A4-A6 and the Classical Propositional Calculus.


2.16. Prove that axiom Q3 is a theorem derivable from the rules of inference with
Margaris’s and Rosser’s axioms A4-A6 and the Classical Propositional Calculus.


The following exercises show that Kleene’s schemata and rules are derivable from
the rules of inference with axioms Q1-Q4 and the Classical Propositional Calculus.

2.17. Prove that Kleene’s $\exists$-rule is derivable from the rules of inference with
axioms Q1-Q4 and the Classical Propositional Calculus.


2.18. Prove that Kleene’s $\forall$-rule is derivable from the rules of inference with
axioms Q1-Q4 and the Classical Propositional Calculus.


2.19. Prove that Kleene’s $\exists$-schema is derivable from the rules of inference with
axioms Q1-Q4 and the Classical Propositional Calculus.


2.20. Prove that Kleene’s $\forall$-schema is derivable from the rules of inference with
axioms Q1-Q4 and the Classical Propositional Calculus.


2.3 Methods of Proof for the Pure Predicate Calculus 


If other considerations guarantee that a well-formed formula P has a proof but do not 
produce any proof of it, then writing down all the proofs of the predicate calculus, 
for instance, in increasing order of complexity, eventually yields among all such 
proofs a proof of P [18, p. 99-100, footnote 183]. However, if the shortest proof of 
P is very long, then this method may take longer than the time available to the user 
to arrive at any proof of P. Thus for all practical purposes this method may also fail 
to determine whether a formula is a theorem. 

The problem of deciding whether a well-formed formula is a theorem, derivable 
from specified axioms and inference rules, is called the decision problem. For the 
pure predicate calculus, no algorithms can provide a step-by-step recipe applicable 
to all well-formed formulae to determine whether any such formula is a theorem, as 
proved by Church [16, 17]. Nevertheless, methods exist to help in deciding whether 
a well-formed formula is a theorem. 

Trial and error is an option [114, p. 31], sometimes working backward from the 
particular well-formed formula as a final goal, or forward from the axioms, inference 
rules, and previous theorems as starting points or intermediate steps [72, p. 54-55]. 
The methods presented in this section guide this method of designing proofs. 


2.3.1 Substituting Equivalent Formulae 


One method of proof consists of replacing any occurrence of a formula by an 
equivalent formula, thanks to theorem 2.54 [18, p. 101, 124, 189], [108, p. 48]. 


2.54 Theorem (Substitutivity of Equivalence in the Pure Predicate Calculus,
preliminary version). For all well-formed logical formulae U and V, if $\vdash (U) \Leftrightarrow
(V)$, and if a formula Q results from substituting any (zero, one, several, or all)
occurrence(s) of U by V in a well-formed formula P, then $\vdash (P) \Leftrightarrow (Q)$.

Proof (Outline). Theorems 1.29 and 1.46 have already established the conclusions
for logical implications and negations.

In all cases, if Q is P, which results by substituting none of the occurrences of U
by V, then $(P) \Leftrightarrow (Q)$ is $(P) \Leftrightarrow (P)$, which is theorem 1.63.

For the universal quantifier, if P is $\forall X(U)$, then Q is either $\forall X(U)$ or $\forall X(V)$.
If Q is $\forall X(V)$, then P with $(U) \Rightarrow (V)$ yields Q, by theorem 2.37, and conversely, Q
with $(V) \Rightarrow (U)$ yields P, by theorem 2.37, or also by theorem 2.45.

For the existential quantifier, theorem 2.46 reduces a formula P of the form
$\exists X(U)$ to the previous cases.

The general case follows by several applications of the previous case and the
cases in theorems 1.29 and 1.46, in a way that may be specified more explicitly
after the availability of the Principle of Mathematical Induction in chapter 4. □


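2.55 Example. Let P denote the formula $\forall X([\exists Y(Y \in X)] \vee \{\forall Z[\neg(Z \in X)]\})$, U
the formula $\exists Y(Y \in X)$, and W the formula $\neg\{\forall Y[\neg(Y \in X)]\}$. Then $\vdash (U) \Leftrightarrow (W)$
by theorem 2.46. Moreover, let V denote the formula $\neg\{\forall Z[\neg(Z \in X)]\}$. Because
Z does not occur in $\neg(Y \in X)$, theorem 2.31 shows that $\vdash (W) \Leftrightarrow (V)$. Hence
$\vdash (U) \Leftrightarrow (V)$ by transitivity. Consequently, $\vdash (P) \Leftrightarrow (Q)$ where Q denotes the formula
$\forall X([\neg\{\forall Z[\neg(Z \in X)]\}] \vee \{\forall Z[\neg(Z \in X)]\})$, which is $\forall X([\neg\{T\}] \vee \{T\})$ with T
denoting $\forall Z[\neg(Z \in X)]$. Since $[\neg(T)] \vee (T)$ is a theorem, by Generalization Q and
hence P is also a theorem.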


2.3.2 Discharging Hypotheses 


A method to design a proof of an implication $(H) \Rightarrow (C)$ consists of first designing
a derivation $H \vdash C$ of C from H, and then transforming the derivation $H \vdash C$ into a
proof of $(H) \Rightarrow (C)$ by theorem 2.56 [108, p. 46-47].


2.56 Theorem (Deduction Theorem, preliminary version). With any axiom
system for which axioms P1, P2, and $(P) \Rightarrow (P)$ are axioms or theorems (or schemata
thereof), and for every derivation $H \vdash C$ of a formula C from a formula H with
the propositional calculus, the rules of inference, and axioms Q1-Q4, but without
Generalization on any free variable in H, there exists a proof of $(H) \Rightarrow (C)$.


Proof (Outline). Every step of the derivation $H \vdash C$ is a formula, denoted here
by S. If S is H, or an axiom, or results from two previous steps and Detachment,
then the proof of the deduction theorem 1.22 for the Pure Positive Implicational
Propositional Calculus shows how to replace S by $(H) \Rightarrow (S)$.

If S is a Generalization $\forall X(R)$ of a previous step R with a variable X that does
not occur freely in H, then R has already been replaced by $(H) \Rightarrow (R)$. Hence
Generalization gives $\forall X[(H) \Rightarrow (R)]$, whence, because X does not occur freely in
H, theorem 2.29 yields a proof of $(H) \Rightarrow [\forall X(R)]$, which is $(H) \Rightarrow (S)$.

The general case follows by several applications of the previous cases in a
way that may be specified more explicitly after the availability of the Principle of
Mathematical Induction in chapter 4. □

Thus, the selection of axioms P1-P3 and Q1-Q4 leads to the Deduction
Theorem (2.56) more directly than would other selections of otherwise equivalent
axioms [108, p. 47]. In practice, however, a derivation $H \vdash C$ of C from H
may already suggest other logical steps that shortcut or bypass the entire procedure
outlined in the proof of the Deduction Theorem 2.56.

To demonstrate such shortcuts, the following theorems provide means for
bringing quantifiers to the “front” of a formula. For example, axioms Q3 and Q4
with theorem 2.54 already allow the replacement of $\neg[\exists X(P)]$ by $\forall X[\neg(P)]$, and of
$\neg[\forall X(P)]$ by $\exists X[\neg(P)]$. In an implication $(R) \Rightarrow (S)$, each of R and S can be of the
form $(P)$, or $\forall X(P)$, or $\exists X(P)$, starting with $\forall$, or $\exists$, or no quantifiers, which leads
to nine different cases. In the case where neither R nor S begins with a quantifier,
no quantifiers need to be brought to the front of $(R) \Rightarrow (S)$. The other eight
cases form the object of the following theorems.

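Theorem 2.57 handles a case where R is $\exists X(P)$ while S is Q.


2.57 Theorem. If X does not occur freely in Q, then
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\exists X(P)] \Rightarrow (Q)\}$.


Proof. Let H denote $\forall X[(P) \Rightarrow (Q)]$, and let C denote $[\exists X(P)] \Rightarrow (Q)$.


$\vdash \forall X[(P) \Rightarrow (Q)]$    hypothesis,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow [(P) \Rightarrow (Q)]$    specialization (Q1),
$\vdash (P) \Rightarrow (Q)$    Detachment,
$\vdash [\neg(Q)] \Rightarrow [\neg(P)]$    contraposition,
$\vdash \forall X\{[\neg(Q)] \Rightarrow [\neg(P)]\}$    Generalization,
$\vdash [\neg(Q)] \Rightarrow \{\forall X[\neg(P)]\}$    theorem 2.29, no free X in Q,
$\vdash (\neg\{\forall X[\neg(P)]\}) \Rightarrow \{\neg[\neg(Q)]\}$    contraposition,
$\vdash [\exists X(P)] \Rightarrow (Q)$    double negation (1.42) and 2.46.


Hence the Deduction Theorem (2.56) leads to a proof of $\{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow
\{[\exists X(P)] \Rightarrow (Q)\}$. Yet the foregoing derivation suggests shortcuts:


$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow [(P) \Rightarrow (Q)]$    axiom Q1,
$\vdash [(P) \Rightarrow (Q)] \Rightarrow \{[\neg(Q)] \Rightarrow [\neg(P)]\}$    contraposition,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\neg(Q)] \Rightarrow [\neg(P)]\}$    transitivity,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow (\forall X\{[\neg(Q)] \Rightarrow [\neg(P)]\})$    theorem 2.30,
$\vdash (\forall X\{[\neg(Q)] \Rightarrow [\neg(P)]\}) \Rightarrow ([\neg(Q)] \Rightarrow \{\forall X[\neg(P)]\})$    theorem 2.29,
$\vdash ([\neg(Q)] \Rightarrow \{\forall X[\neg(P)]\}) \Rightarrow [(\neg\{\forall X[\neg(P)]\}) \Rightarrow \{\neg[\neg(Q)]\}]$    contraposition,
$\vdash [(\neg\{\forall X[\neg(P)]\}) \Rightarrow \{\neg[\neg(Q)]\}] \Rightarrow \{[\exists X(P)] \Rightarrow (Q)\}$    1.42, 2.46,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\exists X(P)] \Rightarrow (Q)\}$    transitivity.


For the converse, let H denote $[\exists X(P)] \Rightarrow (Q)$, and let C denote $\forall X[(P) \Rightarrow (Q)]$.

$\vdash [\exists X(P)] \Rightarrow (Q)$    hypothesis,
$\vdash (P) \Rightarrow [\exists X(P)]$    theorem 2.48,
$\vdash (P) \Rightarrow (Q)$    transitivity,
$\vdash \forall X[(P) \Rightarrow (Q)]$    Generalization.

Again the Deduction Theorem (2.56) leads to a proof of $\{[\exists X(P)] \Rightarrow (Q)\} \Rightarrow
\{\forall X[(P) \Rightarrow (Q)]\}$, but the foregoing derivation suggests shortcuts:

$\vdash (P) \Rightarrow [\exists X(P)]$    theorem 2.48,
$\vdash \{[\exists X(P)] \Rightarrow (Q)\} \Rightarrow [(P) \Rightarrow (Q)]$    theorem 1.34,
$\vdash \{[\exists X(P)] \Rightarrow (Q)\} \Rightarrow \{\forall X[(P) \Rightarrow (Q)]\}$    theorem 2.30.
□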


The proofs of the following theorems emerge from similar outlines, starting with
a derivation $H \vdash C$ of C from H, and transforming it into a proof of $\vdash (H) \Rightarrow (C)$
by shortcuts suggested by the derivation or by the Deduction Theorem (2.56). To
this end, the following derived rule proves useful.


2.58 Theorem (derived rule). If X is not free in Q, and if $\vdash (P) \Rightarrow (Q)$, then
$\vdash [\exists X(P)] \Rightarrow (Q)$.


Proof. Apply theorem 2.57:


$\vdash (P) \Rightarrow (Q)$    hypothesis,
$\vdash \forall X[(P) \Rightarrow (Q)]$    Generalization,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\exists X(P)] \Rightarrow (Q)\}$    theorem 2.57, no free X in Q,
$\vdash [\exists X(P)] \Rightarrow (Q)$    Detachment.
□

2.59 Theorem. If X does not occur freely in P, then $\vdash (P) \Leftrightarrow [\exists X(P)]$.

Proof. Apply theorems 2.48, 1.12, and 2.58.

$\vdash (P) \Rightarrow [\exists X(P)]$    theorem 2.48,
$\vdash (P) \Rightarrow (P)$    theorem 1.12,
$\vdash [\exists X(P)] \Rightarrow (P)$    theorem 2.58, no free X in P.
□


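2.60 Theorem. For all P, Q, and X, $\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\exists X(P)] \Rightarrow
[\exists X(Q)]\}$.


Proof. Let H denote $\forall X[(P) \Rightarrow (Q)]$, and let C denote $[\exists X(P)] \Rightarrow [\exists X(Q)]$.


$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow [(P) \Rightarrow (Q)]$    axiom Q1,
$\vdash [(P) \Rightarrow (Q)] \Rightarrow \{[\neg(Q)] \Rightarrow [\neg(P)]\}$    contraposition,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\neg(Q)] \Rightarrow [\neg(P)]\}$    transitivity,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow (\forall X\{[\neg(Q)] \Rightarrow [\neg(P)]\})$    theorem 2.30,
$\vdash (\forall X\{[\neg(Q)] \Rightarrow [\neg(P)]\}) \Rightarrow (\{\forall X[\neg(Q)]\} \Rightarrow \{\forall X[\neg(P)]\})$    theorem 2.35,
$\vdash (\{\forall X[\neg(Q)]\} \Rightarrow \{\forall X[\neg(P)]\}) \Rightarrow [(\neg\{\forall X[\neg(P)]\}) \Rightarrow (\neg\{\forall X[\neg(Q)]\})]$    contraposition,
$\vdash [(\neg\{\forall X[\neg(P)]\}) \Rightarrow (\neg\{\forall X[\neg(Q)]\})] \Rightarrow \{[\exists X(P)] \Rightarrow [\exists X(Q)]\}$    1.42, 2.46,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\exists X(P)] \Rightarrow [\exists X(Q)]\}$    transitivity.
□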


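2.61 Theorem (derived rule). If $(P) \Leftrightarrow (Q)$ is a theorem, then $[\exists X(P)] \Leftrightarrow
[\exists X(Q)]$ is a theorem.

Proof. Apply Generalization, theorem 2.60, and Detachment to the implication
$(P) \Rightarrow (Q)$ contained in the hypothesis:


$\vdash (P) \Rightarrow (Q)$    hypothesis,
$\vdash \forall X[(P) \Rightarrow (Q)]$    Generalization,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\exists X(P)] \Rightarrow [\exists X(Q)]\}$    theorem 2.60,
$\vdash [\exists X(P)] \Rightarrow [\exists X(Q)]$    Detachment.

The converse implication results from swapping P and Q. □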


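2.62 Theorem. $\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\}$.


Proof. This proof follows Church’s [18, p. 205]. Apply the law of denial of the
antecedent (theorem 1.40) and the law of proof by cases subject to hypotheses
(theorem 1.50):


$\vdash [\neg(P)] \Rightarrow [(P) \Rightarrow (Q)]$    theorem 1.40,
$\vdash \forall X\{[\neg(P)] \Rightarrow [(P) \Rightarrow (Q)]\}$    Generalization,
$\vdash \{\exists X[\neg(P)]\} \Rightarrow \{\exists X[(P) \Rightarrow (Q)]\}$    theorem 2.60 and Detachment,
$\vdash \{\neg[\forall X(P)]\} \Rightarrow \{\exists X[(P) \Rightarrow (Q)]\}$    axiom Q4 and Detachment,
$\vdash (Q) \Rightarrow [(P) \Rightarrow (Q)]$    axiom P1,
$\vdash \forall X\{(Q) \Rightarrow [(P) \Rightarrow (Q)]\}$    Generalization,
$\vdash [\exists X(Q)] \Rightarrow \{\exists X[(P) \Rightarrow (Q)]\}$    theorem 2.60 and Detachment,
$\vdash \{[\forall X(P)] \Rightarrow [\exists X(Q)]\} \Rightarrow \{\exists X[(P) \Rightarrow (Q)]\}$    theorem 1.50 and Detachment.

For the converse, use theorems 2.48 and 2.57:


$\vdash (P) \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow (Q)\}$    law of assertion (1.38),
$\vdash [\forall X(P)] \Rightarrow (P)$    axiom Q1,
$\vdash [\forall X(P)] \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow (Q)\}$    transitivity,
$\vdash [(P) \Rightarrow (Q)] \Rightarrow \{[\forall X(P)] \Rightarrow (Q)\}$    commutation (1.37),
$\vdash (Q) \Rightarrow [\exists X(Q)]$    theorem 2.48,
$\vdash [(P) \Rightarrow (Q)] \Rightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\}$    derived rule,
$\vdash \forall X([(P) \Rightarrow (Q)] \Rightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\})$    Generalization,
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\}$    theorem 2.57 and Detachment.
□

2.63 Theorem. If X does not occur freely in P, then
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{(P) \Rightarrow [\exists X(Q)]\}$.

Proof. This proof follows Church’s [18, p. 205]. Apply theorems 2.39 and 2.62:

$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\}$    theorem 2.62,
$\vdash (P) \Leftrightarrow [\forall X(P)]$    theorem 2.39, no free X in P,
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{(P) \Rightarrow [\exists X(Q)]\}$    derived rules.
□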


Similar to theorem 2.57, theorem 2.64 handles a case where R is $\forall X(P)$ while
S is Q.


2.64 Theorem. If X does not occur freely in Q, then
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\forall X(P)] \Rightarrow (Q)\}$.

Proof. Apply theorems 2.59, 2.62, and 2.54:


$\vdash (Q) \Leftrightarrow [\exists X(Q)]$    theorem 2.59, no free X in Q,
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\forall X(P)] \Rightarrow [\exists X(Q)]\}$    theorem 2.62,
$\vdash \{\exists X[(P) \Rightarrow (Q)]\} \Leftrightarrow \{[\forall X(P)] \Rightarrow (Q)\}$    theorem 2.54.
□


2.3.3 Prenex Normal Form 


Yet another method of proof consists in transforming a formula into an equivalent 
formula in which all the quantifiers, if any, are at the beginning. 


2.65 Definition. A formula Q is in prenex normal form if and only if Q is of the
form


$Q_{|} X_{|} Q_{||} X_{||} \ldots Q_{|\ldots|} X_{|\ldots|} (R)$


optionally with brackets and parentheses, where R is a well-formed formula without
quantifiers and each string $Q_{|\ldots|}$ is either $\forall$ or $\exists$. The formula R is called the matrix
of Q while the string $Q_{|} X_{|} Q_{||} X_{||} \ldots Q_{|\ldots|} X_{|\ldots|}$ is called the prefix of Q.


2.66 Example. The formula $\forall X \exists Z[\neg(Z = X)]$ is in prenex normal form. Its prefix
is $\forall X \exists Z$ while its matrix is $\neg(Z = X)$.


Theorem 2.67 reveals that every well-formed formula is equivalent to a formula 
in prenex normal form [18, §39], [108, p. 49]. 


2.67 Theorem (prenex normal form, preliminary version). For every
well-formed formula P there exists a well-formed formula Q in prenex normal form such
that $\vdash (P) \Leftrightarrow (Q)$.


Proof (Outline). In P, replace bound variables so that different quantifiers bind 
different variables, which gives a formula equivalent to P, by theorems 2.31 
and 2.54. 

With different quantified variables, theorem 2.38 then provides a means to bring 
quantifiers in front of an implication. 

Axioms Q3 and Q4 also provide a means to bring any quantifier in front of any 
negation. 

The general case follows by several applications of the previous cases in a 
way that may be specified more explicitly after the availability of the Principle of 
Mathematical Induction in chapter 4. □


2.68 Example. The formula $\neg\{\exists X[\forall Z(Z = X)]\}$ is not in prenex normal form.
Nevertheless, axiom Q3 gives the equivalent formula $\forall X\{\neg[\forall Z(Z = X)]\}$, whence
axiom Q4 and theorem 2.54 yield the equivalent formula $\forall X\{\exists Z[\neg(Z = X)]\}$, which
is in prenex normal form.

Transforming a logical formula P into an equivalent formula Q in prenex normal
form, or partly so, may reveal a proof of Q and hence also of P, as demonstrated in
example 2.55. Example 2.69 completes the transformation into prenex normal form,
which reveals a propositional theorem in the matrix.


2.69 Example. The formula ∀X([∃Y(Y ∈ X)] ∨ {∀Z[¬(Z ∈ X)]}) is not in prenex 
normal form. Nevertheless, by the definition of ∨ in terms of ¬ and ⇒ the formula 
becomes ∀X({¬[∃Y(Y ∈ X)]} ⇒ {∀Z[¬(Z ∈ X)]}). 

Axiom Q3 gives ∀X({∀Y[¬(Y ∈ X)]} ⇒ {∀Z[¬(Z ∈ X)]}). 

Theorem 2.38 gives ∀X[∀Z({∀Y[¬(Y ∈ X)]} ⇒ [¬(Z ∈ X)])]. 

Theorem 2.64 yields ∀X[∀Z(∃Y{[¬(Y ∈ X)] ⇒ [¬(Z ∈ X)]})], which is a 
theorem: selecting Z for Y gives [¬(Z ∈ X)] ⇒ [¬(Z ∈ X)], which has the pattern 
(P) ⇒ (P) of theorem 1.12. 


Besides providing transformations that may facilitate proofs, as in example 2.55, 
bringing formulae into prenex normal form, in particular, Skolem’s normal form 
with all the existential quantifiers preceding all the universal quantifiers, 


∃X₁ … ∃Xₘ∀Y₁ … ∀Yₙ(R), 


leads to Gödel’s Completeness Theorem, that a formula is a theorem if and only if 
it is valid in all applications, even though no mechanical ways to check either may 
exist [18, §42-§44]. 


2.3.4 Proofs with More than One Quantifier 


The following theorems are examples of theorems involving more than one quanti- 
fier. The first theorem allows for the deletion of a redundant universal quantifier. 


2.70 Theorem. ⊢ [∀X(Q)] ⇔ {∀X[∀X(Q)]}. 

Proof. Apply theorem 2.39 to ∀X(Q), which has no free X. □ 
The second theorem allows for the swap of two consecutive universal quantifiers. 

2.71 Theorem. ⊢ {∀X[∀Y(P)]} ⇔ {∀Y[∀X(P)]}. 


Proof. Apply axiom Q1, theorem 2.29, and Generalization: 


⊢ {∀X[∀Y(P)]} ⇒ [∀Y(P)]        axiom Q1, 
⊢ [∀Y(P)] ⇒ (P)                axiom Q1, 
⊢ {∀X[∀Y(P)]} ⇒ (P)            transitivity (theorem 1.16), 
⊢ {∀X[∀Y(P)]} ⇒ [∀X(P)]        theorem 2.37, 
⊢ {∀X[∀Y(P)]} ⇒ {∀Y[∀X(P)]}    theorem 2.29. 
□ 


The third theorem allows for the swap of two consecutive existential quantifiers. 

2.72 Theorem. ⊢ {∃X[∃Y(P)]} ⇔ {∃Y[∃X(P)]}. 


Proof. Apply the complete law of double negation, axiom Q3, and theorem 2.71: 

∃X[∃Y(P)] 
⇕ double negation, 
¬(¬{∃X[∃Y(P)]}) 
⇕ axiom Q3 and theorem 2.54, 
¬(∀X{¬[∃Y(P)]}) 
⇕ axiom Q3 and theorem 2.54, 
¬(∀X{∀Y[¬(P)]}) 
⇕ theorem 2.71, 
¬(∀Y{∀X[¬(P)]}) 
⇕ axiom Q3 and theorem 2.54, 
¬(∀Y{¬[∃X(P)]}) 
⇕ axiom Q3 and theorem 2.54, 
¬(¬{∃Y[∃X(P)]}) 
⇕ double negation (theorems 1.41 and 1.42). 
∃Y[∃X(P)] 
□ 

The fourth theorem allows for the swap of different quantifiers in an implication. 

2.73 Theorem. ⊢ {∃X[∀Y(P)]} ⇒ {∀Y[∃X(P)]}. 

Proof. Apply theorems 2.48, 2.37, and 2.57: 

⊢ (P) ⇒ [∃X(P)]                  theorem 2.48, 
⊢ [∀Y(P)] ⇒ {∀Y[∃X(P)]}          theorem 2.37, 
⊢ {∃X[∀Y(P)]} ⇒ {∀Y[∃X(P)]}      theorem 2.57, no free X in {∀Y[∃X(P)]}. 


2.74 Counterexample. The converse of theorem 2.73, which would be 

{∀Y[∃X(P)]} ⇒ {∃X[∀Y(P)]}, 

can be False. For instance, in every context with at least two different objects V and 
W, consider the logical formula X = Y for P. 

⊢ ∀Y[∃X(X = Y)]                          for each Y, choose X := Y; 
∃X[∀Y(X = Y)]                            is False: no X equals V and W; 
{∀Y[∃X(X = Y)]} ⇒ {∃X[∀Y(X = Y)]}        is False because (T) ⇏ (F). 
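
Counterexample 2.74 can also be checked by brute force in a context with exactly two objects. The following Python sketch is an illustration only: the names `forall_exists` and `exists_forall` and the representation of a binary formula P by a set of pairs are assumptions made here, and a check in one finite context is not a proof in the predicate calculus.

```python
from itertools import product

domain = ("V", "W")  # two different objects, as in counterexample 2.74
pairs = list(product(domain, repeat=2))


def forall_exists(p):
    """Truth value of  forall Y exists X P(X, Y)  over the domain."""
    return all(any((x, y) in p for x in domain) for y in domain)


def exists_forall(p):
    """Truth value of  exists X forall Y P(X, Y)  over the domain."""
    return any(all((x, y) in p for y in domain) for x in domain)


# Theorem 2.73 in this context: check each of the 16 binary relations.
for bits in product([False, True], repeat=len(pairs)):
    relation = {pair for pair, keep in zip(pairs, bits) if keep}
    assert (not exists_forall(relation)) or forall_exists(relation)

# Counterexample 2.74: take X = Y for P.
equality = {(x, y) for (x, y) in pairs if x == y}
converse = (not forall_exists(equality)) or exists_forall(equality)
print(forall_exists(equality), exists_forall(equality), converse)
```

The implication of theorem 2.73 holds for every relation on the two objects, whereas its converse fails for the equality relation, because the hypothesis is True and the conclusion is False.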


2.3.5 Exercises on the Substitutivity of Equivalence 


The following exercises focus on details of the proof of theorem 2.54, with the 
logical equivalence ⇔ defined either by Tarski’s axioms IV, V, VI in example 1.87 
on page 55, or with ⇒ and ∧ in definition 1.51 on page 38. 


2.21. Prove that if P denotes (U) ⇒ (W), if Q denotes (V) ⇒ (W), and if 
⊢ (V) ⇔ (U), then ⊢ (P) ⇒ (Q). 


2.22. Prove that if P denotes (W) ⇒ (U), if Q denotes (W) ⇒ (V), and if 
⊢ (V) ⇔ (U), then ⊢ (P) ⇒ (Q). 


2.23. Prove that if P denotes (U) ⇒ (W), if Q denotes (V) ⇒ (W), and if 
⊢ (V) ⇔ (U), then ⊢ (Q) ⇒ (P). 


2.24. Prove that if P denotes (W) ⇒ (U), if Q denotes (W) ⇒ (V), and if 
⊢ (V) ⇔ (U), then ⊢ (Q) ⇒ (P). 


2.25. Prove that if P denotes ∀X(U), if Q denotes ∀X(V), without free occurrences 
of X in U and V, and if ⊢ (V) ⇔ (U), then ⊢ (P) ⇒ (Q). 


2.26. Prove that if P denotes ∀X(U), if Q denotes ∀X(V), without free occurrences 
of X in U and V, and if ⊢ (V) ⇔ (U), then ⊢ (Q) ⇒ (P). 


2.27. Prove that if P denotes ¬(U), if Q denotes ¬(V), and if ⊢ (V) ⇔ (U), then 
⊢ (P) ⇒ (Q). 


2.28. Prove that if P denotes ¬(U), if Q denotes ¬(V), and if ⊢ (V) ⇔ (U), then 
⊢ (Q) ⇒ (P). 


2.29. Prove that if P denotes either (U) ⇒ (W) or (W) ⇒ (U), if Q denotes either 
(V) ⇒ (W) or (W) ⇒ (V), respectively, and if ⊢ (V) ⇔ (U), then ⊢ (P) ⇔ (Q). 


2.30. Prove that if P denotes ¬(U), if Q denotes ¬(V), and if ⊢ (V) ⇔ (U), then 
⊢ (P) ⇔ (Q). 


2.4 Predicate Calculus with Other Connectives 


This section introduces theorems with quantifiers and conjunctions or disjunctions. 


2.4.1 Universal Quantifiers and Conjunctions or Disjunctions 


This subsection presents theorems involving the universal quantifier (∀) and a 
conjunction (∧) or disjunction (∨), beginning with an implication with a conjunction. 


2.75 Theorem. ⊢ {∀X[(P) ∧ (Q)]} ⇒ {[∀X(P)] ∧ [∀X(Q)]}. 

Proof. Apply theorems 1.53, 2.37, 1.52, 1.55: 

⊢ [(P) ∧ (Q)] ⇒ (P)                           theorem 1.53, 
⊢ {∀X[(P) ∧ (Q)]} ⇒ {∀X(P)}                   theorem 2.37, 
⊢ [(P) ∧ (Q)] ⇒ (Q)                           theorem 1.52, 
⊢ {∀X[(P) ∧ (Q)]} ⇒ {∀X(Q)}                   theorem 2.37, 
⊢ {∀X[(P) ∧ (Q)]} ⇒ {[∀X(P)] ∧ [∀X(Q)]}       theorem 1.55. 
□ 

The converse implication forms the object of the following theorem. 

2.76 Theorem. ⊢ {[∀X(P)] ∧ [∀X(Q)]} ⇒ {∀X[(P) ∧ (Q)]}. 

Proof. Apply axiom Q1 with theorems 1.82 and 2.30: 

⊢ [∀X(P)] ⇒ (P)                               axiom Q1, 
⊢ [∀X(Q)] ⇒ (Q)                               axiom Q1, 
⊢ {[∀X(P)] ∧ [∀X(Q)]} ⇒ [(P) ∧ (Q)]           theorem 1.82, 
⊢ {[∀X(P)] ∧ [∀X(Q)]} ⇒ {∀X[(P) ∧ (Q)]}       theorem 2.30. 
□ 

The following theorem gives an implication with a disjunction. 

2.77 Theorem. ⊢ {[∀X(P)] ∨ [∀X(Q)]} ⇒ {∀X[(P) ∨ (Q)]}. 

Proof. Apply axiom Q1, exercise 1.57, theorem 2.29, and Generalization: 

⊢ [∀X(P)] ⇒ (P)                               axiom Q1, 
⊢ [∀X(Q)] ⇒ (Q)                               axiom Q1, 
⊢ {[∀X(P)] ∨ [∀X(Q)]} ⇒ [(P) ∨ (Q)]           exercise 1.57, 
⊢ ∀X({[∀X(P)] ∨ [∀X(Q)]} ⇒ [(P) ∨ (Q)])       Generalization, 
⊢ {[∀X(P)] ∨ [∀X(Q)]} ⇒ {∀X[(P) ∨ (Q)]}       theorem 2.29. 
□ 


2.78 Counterexample. The converse of theorem 2.77, which would be 


{∀X[(P) ∨ (Q)]} ⇒ {[∀X(P)] ∨ [∀X(Q)]}, 


may be False. For instance, in every context with exactly two different objects V and 
W, consider the formulae X = V for P and X = W for Q: 


⊢ ∀X[(X = V) ∨ (X = W)]            because either (X = V) or (X = W); 
∀X(X = V)                          is False if X := W; 
∀X(X = W)                          is False if X := V; 
[∀X(X = V)] ∨ [∀X(X = W)]          is False by the preceding two lines; 
{∀X[(X = V) ∨ (X = W)]} 
⇒ {[∀X(X = V)] ∨ [∀X(X = W)]}      is False because (T) ⇏ (F). 

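
The same context with exactly two objects is small enough to examine every pair of unary formulae P and Q at once. The following Python sketch is an illustration under assumptions made here: each formula is represented by the set of objects that satisfy it, and the variable names are hypothetical. It confirms theorem 2.77 in this context and collects the pairs for which the converse fails.

```python
from itertools import product

domain = ("V", "W")  # exactly two different objects, as in counterexample 2.78

# Every unary formula on this domain, represented by the set of objects
# that satisfy it.
subsets = [frozenset(x for x, keep in zip(domain, bits) if keep)
           for bits in product([False, True], repeat=len(domain))]

failures = []
for p, q in product(subsets, repeat=2):
    forall_or = all((x in p) or (x in q) for x in domain)
    or_forall = all(x in p for x in domain) or all(x in q for x in domain)
    # Theorem 2.77 in this context: the disjunction of the two universal
    # formulae implies the universal disjunction.
    assert (not or_forall) or forall_or
    # The converse fails exactly when forall_or is True and or_forall is False.
    if forall_or and not or_forall:
        failures.append((set(p), set(q)))

print(failures)
```

Among the sixteen pairs, the converse fails only when P holds for exactly one of the two objects and Q holds for exactly the other one, which is the situation of counterexample 2.78.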
However, theorem 2.79 shows a converse of theorem 2.77 in a particular case. 


2.79 Theorem. If P has no free X, then ⊢ {∀X[(P) ∨ (Q)]} ⇒ {(P) ∨ [∀X(Q)]} and 
⊢ {∀X[(P) ∨ (Q)]} ⇒ {[∀X(P)] ∨ [∀X(Q)]}. 

Proof. Apply the definition of ∨: 

∀X[(P) ∨ (Q)] 
⇕ definition of (P) ∨ (Q), 
∀X{[(P) ⇒ (Q)] ⇒ (Q)} 
⇓ theorem 2.35, 
{∀X[(P) ⇒ (Q)]} ⇒ [∀X(Q)] 
⇕ theorems 2.38, 2.54, no free X in P, 
{(P) ⇒ [∀X(Q)]} ⇒ [∀X(Q)] 
⇕ definition of ∨, 
(P) ∨ [∀X(Q)] 
⇕ theorems 2.39, 2.54, no free X in P. 
[∀X(P)] ∨ [∀X(Q)] 


2.4.2 Existential Quantifiers and Conjunctions or Disjunctions 


This subsection presents theorems involving the existential quantifier (∃) and a 
conjunction (∧) or disjunction (∨), beginning with an equivalence with a disjunction. 


2.80 Theorem. ⊢ {[∃X(P)] ∨ [∃X(Q)]} ⇔ {∃X[(P) ∨ (Q)]}. 


Proof. Apply contraposition with theorems 2.75, 2.76, and 2.45: 

⊢ (∀X{[¬(P)] ∧ [¬(Q)]}) ⇔ ({∀X[¬(P)]} ∧ {∀X[¬(Q)]})        2.75, 2.76, 
⇕ contraposition, 
[¬({∀X[¬(P)]} ∧ {∀X[¬(Q)]})] ⇔ [¬(∀X{[¬(P)] ∧ [¬(Q)]})] 
⇕ 2.45, 
[(¬{∀X[¬(P)]}) ∨ (¬{∀X[¬(Q)]})] ⇔ [¬(∀X{¬[(P) ∨ (Q)]})] 
⇕ axiom Q4, 
[(¬{¬[∃X(P)]}) ∨ (¬{¬[∃X(Q)]})] ⇔ [¬(¬{∃X[(P) ∨ (Q)]})] 
⇕ double negations. 
{[∃X(P)] ∨ [∃X(Q)]} ⇔ {∃X[(P) ∨ (Q)]} 

A similar equivalence with a conjunction requires that X be not free in P. 

2.81 Theorem. If P has no free X, then ⊢ {∃X[(P) ∧ (Q)]} ⇔ {(P) ∧ [∃X(Q)]}. 


Proof. Apply the full propositional calculus with theorems 1.69, 2.38, 2.46, 
and 2.61, and axiom Q4: 

∃X[(P) ∧ (Q)] 
⇕ double negation, 
∃X(¬{¬[(P) ∧ (Q)]}) 
⇕ De Morgan’s first law and theorem 2.61, 
∃X(¬{[¬(P)] ∨ [¬(Q)]}) 
⇕ definition of ∨, 
∃X(¬{(P) ⇒ [¬(Q)]}) 
⇕ axiom Q4, 
¬(∀X{(P) ⇒ [¬(Q)]}) 
⇕ theorem 2.38, 
¬[(P) ⇒ {∀X[¬(Q)]}] 
⇕ definition of ∨ by theorem 1.69, 
¬([¬(P)] ∨ {∀X[¬(Q)]}) 
⇕ De Morgan’s second law and double negation, 
(P) ∧ (¬{∀X[¬(Q)]}) 
⇕ theorem 2.46. 
(P) ∧ [∃X(Q)] 


2.4.3 Exercises on Quantifiers with Other Connectives 


For the following exercises, prove that the stated formulae are theorem schemata. 

2.31. {∃X[(P) ∨ (Q)]} ⇒ {∃X[(Q) ∨ (P)]}. 

2.32. {∀X[(P) ∧ (P)]} ⇒ {∀X(P)}. 

2.33. {∃X[(P) ∨ (P)]} ⇒ {∃X(P)}. 

2.34. (∃X{[(P) ∨ (Q)] ∨ (R)}) ⇔ (∃X{(P) ∨ [(Q) ∨ (R)]}). 

2.35. (∀X{[(P) ∧ (Q)] ∨ (R)}) ⇒ ({∀X[(P) ∨ (R)]} ∧ {∀X[(Q) ∨ (R)]}). 

2.36. (∃X{[(P) ∨ (Q)] ∧ (R)}) ⇒ ({∃X[(P) ∧ (R)]} ∨ {∃X[(Q) ∧ (R)]}). 

2.37. [∃X(Q)] ⇔ {∃X[∃X(Q)]}. 

2.38. If P has no free X, then {(P) ∧ [∀X(Q)]} ⇔ {∀X[(P) ∧ (Q)]}. 

2.39. If P has no free X, then {(P) ∨ [∀X(Q)]} ⇔ {∀X[(P) ∨ (Q)]}. 

2.40. If P has no free X, then {(P) ∨ [∃X(Q)]} ⇔ {∃X[(P) ∨ (Q)]}. 


2.5 Equality-Predicates 


Applications of logic, for instance, algebra, arithmetic, and geometry, may include 
concepts of “equality” that allow for substitutions of mutually equal objects in 
statements and formulae, which results in mutually equivalent statements and 
formulae. 


2.5.1 First-Order Predicate Calculi with an Equality-Predicate 


Different applications may define equality differently [8, p. 6-7]. For instance, in 
some versions of integer arithmetic, the equality a = b means that a and b are 
two symbols for one integer [25, p. 44], [76, p. 1]. In contrast, in some versions 
of set theory, the equality A = B means that A and B denote sets with identical 
set-theoretical features: they have the same elements, and they are elements of the 
same sets [8, 35, p. 6-7]; the question whether A and B denote the same set does 
not arise in the theory. Nevertheless, such different concepts of equality happen to 
conform to a logical predicate, denoted by ℐ to suggest identity, subject to the 
following axioms (which might also be called postulates to distinguish them from 
logical axioms) [18, § 48]. 


Axiom ℐ1 (reflexivity of equality) ⊢ ℐ(X, X). 


Axiom ℐ2 (substitutivity of equality) ⊢ [ℐ(X, Y)] ⇒ [(P) ⇒ (Q)] for all well-formed 
formulae P and Q such that Q results from the substitution of any one free 
occurrence of X in P by Y, provided that the resulting occurrence of Y is also free, 
or, in other words, provided that in P this occurrence of X is not within the scope of 
a quantifier (∀X, ∀Y, ∃X, ∃Y) binding X or Y. 


The condition stipulated in axiom ℐ2 is similar to the requirement that P admit 
Y for X, or that Y be free for X, but only for one particular occurrence of X in P. 

Using only the Pure Positive Implicational Propositional Calculus, theorems 2.82 
and 2.83 show that every predicate ℐ satisfying axioms ℐ1 and ℐ2 is symmetric 
and transitive [84, p. 104]. 


2.82 Theorem (symmetry of equality). ⊢ [ℐ(X, Y)] ⇒ [ℐ(Y, X)]. 


Proof. In axiom ℐ2, substitute the atomic formulae ℐ(X, X) for P and ℐ(Y, X) for Q: 

⊢ [ℐ(X, Y)] ⇒ {[ℐ(X, X)] ⇒ [ℐ(Y, X)]}      axiom ℐ2, 
⊢ ℐ(X, X)                                  axiom ℐ1, 
⊢ [ℐ(X, Y)] ⇒ [ℐ(Y, X)]                    derived rule (theorem 1.15). 
□ 


2.83 Theorem (transitivity of equality). ⊢ [ℐ(X, Y)] ⇒ {[ℐ(Y, Z)] ⇒ 
[ℐ(X, Z)]}. Hence, if ⊢ ℐ(X, Y) and ⊢ ℐ(Y, Z), then ⊢ ℐ(X, Z). 


Proof. Use the symmetry of equality (theorem 2.82) and axiom ℐ2: 

⊢ [ℐ(X, Y)] ⇒ [ℐ(Y, X)]                      theorem 2.82, 
⊢ [ℐ(Y, X)] ⇒ {[ℐ(Y, Z)] ⇒ [ℐ(X, Z)]}        axiom ℐ2, 
⊢ [ℐ(X, Y)] ⇒ {[ℐ(Y, Z)] ⇒ [ℐ(X, Z)]}        transitivity (theorem 1.16), 


⊢ ℐ(X, Y)                                    hypothesis, 
⊢ [ℐ(Y, Z)] ⇒ [ℐ(X, Z)]                      Detachment, 
⊢ ℐ(Y, Z)                                    hypothesis, 
⊢ ℐ(X, Z)                                    Detachment. 


Using only the Pure Positive Implicational Propositional Calculus, theorem 2.84 
extends axiom ℐ2 to a converse implication, so that substituting mutually equal 
objects results in mutually equivalent formulae. 


2.84 Theorem (substitutivity of equality). ⊢ [ℐ(X, Y)] ⇒ [(P) ⇔ (Q)] for all 
well-formed formulae P and Q such that Q results from the substitution of any one 
free occurrence of X in P by Y, provided that in P this occurrence of X is not within 
the scope of a quantifier (∀X, ∀Y, ∃X, ∃Y) binding X or Y. 


Proof. The implication ⊢ [ℐ(X, Y)] ⇒ [(P) ⇒ (Q)] is axiom ℐ2. 

For the converse, the hypothesis also states that in Q the resulting occurrence 
of Y is not within the scope of a quantifier (∀X, ∀Y, ∃X, ∃Y) binding Y or X, 
which allows swapping X and Y, and swapping P and Q, in axiom ℐ2, so that 
⊢ [ℐ(Y, X)] ⇒ [(Q) ⇒ (P)]. Hence the conclusion follows from the symmetry 
⊢ [ℐ(X, Y)] ⇒ [ℐ(Y, X)] by theorem 2.82 and the transitivity of implication. □ 


Repeated applications of theorem 2.84 and the proof of substitutivity of equiv- 
alence then show that substituting mutually equal objects in a formula leads to an 
equivalent formula. 


2.5.2 Simple Applied Predicate Calculi with an Equality-Predicate 


Some applications of logic might omit all propositional variables and instead have 
only atomic formulae with a few predicates, or perhaps only one predicate, which 
might be denoted by some constant ℰ. Such applications are called simple applied 
predicate calculi. For instance, a version of set theory has no propositional variables 
and only one predicate, for set membership, so that ℰ(X, Y) stands for X ∈ Y. 
In such applications, an additional equality predicate ℐ allows for substitutions 
of mutually equal objects in statements and formulae if and only if ℐ is 
reflexive (a condition that replaces axiom ℐ1), symmetric, transitive, and satisfies 
the following two conditions, which replace axiom ℐ2 [18, p. 283, exercise 48.3]: 


[ℐ(A, B)] ⇒ {[ℰ(X, A)] ⇒ [ℰ(X, B)]}, 
[ℐ(A, B)] ⇒ {[ℰ(A, Y)] ⇒ [ℰ(B, Y)]}. 

In applied logics with other predicates, for instance, predicates for the sum and 
products of integers in arithmetic, two similar conditions must be appended for each 
predicate to ensure the substitutivity of mutually equal objects. By the postulated 
symmetry of the equality predicate ℐ these two conditions are equivalent to 

[ℐ(A, B)] ⇒ {[ℰ(X, A)] ⇔ [ℰ(X, B)]}, 
[ℐ(A, B)] ⇒ {[ℰ(A, Y)] ⇔ [ℰ(B, Y)]}. 

These conditions suffice to ensure that if ℐ(A, B) holds, then replacing any free 
occurrence of A by B according to the conditions stipulated 
by axiom ℐ2 in any formula P produces an equivalent formula Q, because well-formed 
formulae include only atomic formulae of the form ℰ(Z, W). The proof 
follows the pattern of the proof of the substitutivity of equivalence. The resulting 
theorem is called the substitutivity of equality. 

In particular, if a simple applied predicate calculus has exactly one predicate, 
ℰ, which is binary (involving exactly two individual variables), then the same 
conditions may serve to define an equality predicate ℐ so that ℐ(A, B) is merely 
an abbreviation for 


(∀X{[ℰ(X, A)] ⇔ [ℰ(X, B)]}) ∧ (∀Y{[ℰ(A, Y)] ⇔ [ℰ(B, Y)]}).    (2.1) 


Formula (2.1), abbreviated by ℐ(A, B), satisfies axiom ℐ2 and is reflexive, 
symmetric, and transitive, because so is the equivalence ⇔ in the full propositional 
calculus. In particular, the theorem on substitutivity of equality holds. An equality 
predicate ℐ defined in this manner from another binary predicate thus does not 
add anything to the theory except convenience. 
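
Formula (2.1) can be evaluated directly in a finite context. The following Python sketch is an illustration under assumptions made here: the function name `defined_equality`, the four objects 0, 1, 2, 3, and the sample relation standing for ℰ are hypothetical choices, and a check in one finite context does not replace the proofs of theorems 2.86, 2.87, and 2.88.

```python
def defined_equality(domain, E):
    """Return the set of pairs (a, b) that satisfy formula (2.1) for the relation E."""
    return {
        (a, b)
        for a in domain
        for b in domain
        # for all X: E(X, a) if and only if E(X, b)
        if all(((x, a) in E) == ((x, b) in E) for x in domain)
        # for all Y: E(a, Y) if and only if E(b, Y)
        and all(((a, y) in E) == ((b, y) in E) for y in domain)
    }


domain = range(4)
# A sample binary relation, chosen so that objects 1 and 2 cannot be told apart.
E = {(0, 1), (0, 2), (1, 3), (2, 3)}
I = defined_equality(domain, E)

reflexive = all((a, a) in I for a in domain)
symmetric = all((b, a) in I for (a, b) in I)
transitive = all((a, d) in I for (a, b) in I for (c, d) in I if b == c)
print(sorted(I), reflexive, symmetric, transitive)
```

In this sample the defined equality relates each object to itself and also relates 1 and 2, which stand in exactly the same relations to every object. The three properties hold, as the theorems below establish in general.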


2.85 Example (equality in set theory). In a version of set theory, with the single 
binary predicate ℰ(Z, W), the postulate (or axiom) of extensionality states that 


(∀X{[ℰ(X, A)] ⇔ [ℰ(X, B)]}) ⇔ (∀Y{[ℰ(A, Y)] ⇔ [ℰ(B, Y)]}).    (2.2) 


The equality ℐ(A, B) of sets A and B is then an abbreviation of each of the formulae 
∀X{[ℰ(X, A)] ⇔ [ℰ(X, B)]} and ∀Y{[ℰ(A, Y)] ⇔ [ℰ(B, Y)]}. 


The following theorems confirm that every equality-predicate defined by for- 
mula (2.1) is reflexive, symmetric, and transitive. 


2.86 Theorem (reflexivity of defined equality-predicates). Every equality-predicate 
ℐ(A, B) defined by formula (2.1) is reflexive: 


⊢ ∀C[ℐ(C, C)]. 


Proof. One method to design a formal proof transforms the objective, here the 
yet unproved formula ∀C[ℐ(C, C)], first into its defining formula (2.1), and then 
into logically equivalent formulae, for instance, in prenex form, until one such 
equivalent formula appears that is a theorem, thanks to an axiom or to a previously 
proven theorem. For instance, substituting C for A and also C for B in the defining 
formula (2.1) gives 

ℐ(C, C)        yet unproved, 
⇕ definition of ℐ 
(∀X{[ℰ(X, C)] ⇔ [ℰ(X, C)]}) ∧ (∀Y{[ℰ(C, Y)] ⇔ [ℰ(C, Y)]}) 
⇕ theorem 2.75, 
∀X∀Y({[ℰ(X, C)] ⇔ [ℰ(X, C)]} ∧ {[ℰ(C, Y)] ⇔ [ℰ(C, Y)]}) 

where each logical formula [ℰ(W, Z)] ⇔ [ℰ(W, Z)] has the pattern of the reflexivity 
of the logical equivalence (P) ⇔ (P) (theorem 1.63). Thus, a complete proof may 
proceed as follows: 


⊢ (P) ⇔ (P)                                               theorem 1.63, 
⊢ [ℰ(X, C)] ⇔ [ℰ(X, C)]                                   substitution in (P) ⇔ (P), 
⊢ [ℰ(C, Y)] ⇔ [ℰ(C, Y)]                                   substitution in (P) ⇔ (P), 
⊢ {[ℰ(X, C)] ⇔ [ℰ(X, C)]} ∧ {[ℰ(C, Y)] ⇔ [ℰ(C, Y)]}       theorem 1.54, 
⊢ ℐ(C, C)                                                 formula (2.1). 

Hence ⊢ ∀C[ℐ(C, C)] results by Generalization and theorem 2.75. □ 


2.87 Theorem (symmetry of defined equality-predicates). Every equality-predicate 
defined by formula (2.1) is symmetric: if ⊢ ℐ(A, B), then ⊢ ℐ(B, A); 
moreover, 


⊢ ∀A∀B{[ℐ(A, B)] ⇒ [ℐ(B, A)]}. 


Proof. One method to design a formal proof transforms the objective, here the 
yet unproved formula ⊢ ∀A∀B{[ℐ(A, B)] ⇒ [ℐ(B, A)]}, first into its defining 
formula (2.1), and then into logically equivalent formulae, for instance, in prenex 
form, until one such equivalent formula appears that is a theorem, thanks to an 
axiom or to a previously proven theorem. Here an equivalence will emerge: 

[ℐ(A, B)] ⇔ [ℐ(B, A)]        yet unproved, 
⇕ definition of ℐ 
(∀X{[ℰ(X, A)] ⇔ [ℰ(X, B)]}) ∧ (∀Y{[ℰ(A, Y)] ⇔ [ℰ(B, Y)]}) 
⇔ (∀X{[ℰ(X, B)] ⇔ [ℰ(X, A)]}) ∧ (∀Y{[ℰ(B, Y)] ⇔ [ℰ(A, Y)]}), 

which suggests invoking the symmetry of the logical equivalence [(P) ⇔ (Q)] ⇔ 
[(Q) ⇔ (P)] (theorem 1.64). Thus, a complete proof may proceed as follows: 


⊢ [(P) ⇔ (Q)] ⇔ [(Q) ⇔ (P)]                                   theorem 1.64, 
⊢ {[ℰ(X, A)] ⇔ [ℰ(X, B)]} ⇔ {[ℰ(X, B)] ⇔ [ℰ(X, A)]}           substitution, 
⊢ [(R) ⇔ (S)] ⇔ [(S) ⇔ (R)]                                   theorem 1.64, 
⊢ {[ℰ(A, Y)] ⇔ [ℰ(B, Y)]} ⇔ {[ℰ(B, Y)] ⇔ [ℰ(A, Y)]}           substitution, 
⊢ {[(P) ⇔ (Q)] ∧ [(R) ⇔ (S)]} ⇔ {[(Q) ⇔ (P)] ∧ [(S) ⇔ (R)]}   theorem 1.82. 

Hence the conclusion results by Generalization and theorem 2.75. □ 


2.88 Theorem (transitivity of defined equality-predicates). Every equality-predicate 
defined by formula (2.1) is transitive: if ⊢ ℐ(A, B) and ⊢ ℐ(B, C), 
then ⊢ ℐ(A, C); moreover, 


⊢ ∀A∀B∀C({[ℐ(A, B)] ∧ [ℐ(B, C)]} ⇒ [ℐ(A, C)]). 

Proof. One method to design a formal proof transforms the objective, here the yet 
unproved formula ⊢ ∀A∀B∀C({[ℐ(A, B)] ∧ [ℐ(B, C)]} ⇒ [ℐ(A, C)]), first into 
its defining formula (2.1), and then into logically equivalent formulae, for instance, 
in prenex form, until one such equivalent formula appears that is a theorem, thanks 
to an axiom or to a previously proven theorem. 

{[ℐ(A, B)] ∧ [ℐ(B, C)]} ⇒ [ℐ(A, C)]        yet unproved, 
⇕ definition of ℐ 
[(∀X{[ℰ(X, A)] ⇔ [ℰ(X, B)]}) ∧ (∀Y{[ℰ(A, Y)] ⇔ [ℰ(B, Y)]})] 
∧ [(∀X{[ℰ(X, B)] ⇔ [ℰ(X, C)]}) ∧ (∀Y{[ℰ(B, Y)] ⇔ [ℰ(C, Y)]})] 
⇒ [(∀X{[ℰ(X, A)] ⇔ [ℰ(X, C)]}) ∧ (∀Y{[ℰ(A, Y)] ⇔ [ℰ(C, Y)]})], 

which suggests invoking the transitivity of the logical equivalence (theorem 1.65): 


{[(H) ⇔ (K)] ∧ [(K) ⇔ (L)]} ⇒ [(H) ⇔ (L)], 
{[(P) ⇔ (Q)] ∧ [(Q) ⇔ (R)]} ⇒ [(P) ⇔ (R)], 


with the commutativity and associativity of the logical conjunction (theorems 1.57 
and 1.66) combined with theorem 1.82: 


({[(H) ⇔ (K)] ∧ [(K) ⇔ (L)]} ∧ {[(P) ⇔ (Q)] ∧ [(Q) ⇔ (R)]}) 
⇒ {[(H) ⇔ (L)] ∧ [(P) ⇔ (R)]}. 


The conclusion results by Generalization and theorem 2.75. □ 


2.5.3 Other Axiom Systems for the Equality-Predicate 


Other axiom systems exist to specify the identity predicate. 


Axiom J1 (reflexivity of equality) ⊢ ∀X[ℐ(X, X)]. 


Axiom J2 (substitutivity of equality) For every unary predicate variable or 
predicate constant ℱ, involving only one individual variable, 


⊢ ∀X∀Y([ℐ(X, Y)] ⇒ {[ℱ(X)] ⇒ [ℱ(Y)]}). 


For every binary predicate variable or predicate constant ℱ, involving only two 
individual variables, 


⊢ ∀X∀Y∀W∀Z{[ℐ(X, Y)] ⇒ ([ℐ(W, Z)] ⇒ {[ℱ(X, W)] ⇒ [ℱ(Y, Z)]})}. 

For every ternary predicate variable or predicate constant ℱ, involving only three 
individual variables, 

⊢ ∀U∀V∀X∀Y∀W∀Z 
([ℐ(U, V)] ⇒ {[ℐ(X, Y)] ⇒ ([ℐ(W, Z)] ⇒ {[ℱ(U, X, W)] ⇒ [ℱ(V, Y, Z)]})}). 


Similar stipulations also hold for predicate variables or constants involving more 
than three individual variables. 


2.5.4 Defined Ranking-Predicates 


Each application with a binary predicate constant ℰ allows for a corresponding 
predicate ℛ of ranking, also called ordering or inequality, defined in terms of 
ℰ so that ℛ(A, B) is an abbreviation of the formula 


∀X{[ℰ(X, A)] ⇒ [ℰ(X, B)]}.    (2.3) 


Formula (2.3) is also denoted by A ≼ B (read “A precedes B”) instead of ℛ(A, B). 
The resulting predicate ℛ is reflexive and transitive, but not necessarily symmetric, 
as verified in the exercises. 
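
Before attempting the exercises, it may help to evaluate formula (2.3) in a small finite context. The following Python sketch is an illustration under assumptions made here: the function name `ranking`, the three objects 0, 1, 2, and the sample relation standing for ℰ are hypothetical choices. One finite context can exhibit a failure of symmetry, but it cannot prove reflexivity or transitivity in general.

```python
def ranking(domain, E):
    """Return the set of pairs (a, b) that satisfy formula (2.3) for the relation E."""
    return {
        (a, b)
        for a in domain
        for b in domain
        # for all X: E(X, a) implies E(X, b)
        if all(((x, a) not in E) or ((x, b) in E) for x in domain)
    }


domain = range(3)
# A sample relation: 0 is related to 1 and to 2, and 1 is related to 2.
E = {(0, 1), (0, 2), (1, 2)}
R = ranking(domain, E)

reflexive = all((a, a) in R for a in domain)
symmetric = all((b, a) in R for (a, b) in R)
transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
print(sorted(R), reflexive, symmetric, transitive)
```

In this sample the pair (0, 1) satisfies formula (2.3) but the pair (1, 0) does not, so the ranking is reflexive and transitive here without being symmetric.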


2.5.5 Exercises on Equality-Predicates 


2.41. Verify that the ranking-predicate ℛ(A, B) defined by formula (2.3) is reflexive: 
prove ⊢ ∀A[ℛ(A, A)]. 


2.42. Investigate whether the ranking-predicate ℛ(A, B) defined by formula (2.3) 
is symmetric: is ∀A∀B{[ℛ(A, B)] ⇒ [ℛ(B, A)]} a theorem? 


2.43. Verify that the ranking-predicate ℛ(A, B) defined by formula (2.3) is transitive: 
prove ⊢ ∀A∀B∀C({[ℛ(A, B)] ∧ [ℛ(B, C)]} ⇒ [ℛ(A, C)]). 


Exercises 2.44, 2.45, and 2.46 focus on the alternative ranking predicate 𝒜(A, B) 
defined in terms of the same binary predicate constant ℰ as an abbreviation of 
formula (2.4): 


∀Y{[ℰ(B, Y)] ⇒ [ℰ(A, Y)]}.    (2.4) 


2.44. Verify that the alternative ranking-predicate 𝒜(A, B) defined by formula (2.4) 
is transitive: prove ⊢ ∀A∀B∀C({[𝒜(A, B)] ∧ [𝒜(B, C)]} ⇒ [𝒜(A, C)]). 


2.45. Verify that the alternative ranking-predicate 𝒜(A, B) defined by 
formula (2.4) is reflexive: prove ⊢ ∀A[𝒜(A, A)]. 


2.46. Investigate whether the alternative ranking-predicate 𝒜(A, B) defined by 
formula (2.4) is symmetric: determine whether ∀A∀B{[𝒜(A, B)] ⇒ [𝒜(B, A)]} 
is a theorem. 


2.47. Verify that the equality predicate defined as in example 2.85 for set theory 
satisfies the alternative axioms J1 and J2 from subsection 2.5.3. 


2.48. Verify that the equality predicate defined as in example 2.85 for set theory 
satisfies axioms ℐ1 and ℐ2 from subsection 2.5.1. 


2.49. Verify that the equality predicate defined by axioms ℐ1 and ℐ2 from 
subsection 2.5.1 also satisfies the alternative axioms J1 and J2 from subsection 2.5.3. 


2.50. Verify that the equality predicate defined by axioms J1 and J2 from 
subsection 2.5.3 also satisfies axioms ℐ1 and ℐ2 from subsection 2.5.1. 


Chapter 3 
Set Theory: Proofs by Detachment, 
Contraposition, and Contradiction 


3.1 Introduction 


This chapter introduces set theory from two parallel perspectives: as an intuitive 
mathematical theory, and as a simple applied predicate calculus of first order. 
Starting from first-order logic and some of the Zermelo-Fraenkel axioms (exten- 
sionality, empty set, pairing, power set, separation, and union), where all objects 
under consideration are sets, the chapter first derives relations between sets, subsets, 
supersets, unions, intersections, and Cartesian products of sets of sets. Subsequent 
sections introduce relations, functions, injections, surjections, bijections, composite 
functions, and inverse functions. Another section focuses on the duality between 
partitions and equivalence relations. The last section deals with pre-orders, partial 
orders, linear or total orders, and well-orders. Many proofs begin with an informal 
intuitive proof, then demonstrate how to design a more formal proof, and finally 
present a detailed outline of such a formal proof in first-order logic. The other 
Zermelo-Fraenkel axioms (choice and infinity or substitution) are only mentioned 
here, because they form the topic of subsequent chapters. The prerequisites for 
this chapter consist of a working knowledge of first-order logic, for instance, as 
described in chapters 1 and 2, which contain all the logical theorems cited in this 
chapter. 

For some practical problems, features that are essential to their solutions can be 
specified in terms of sets or collections of objects. 


3.1 Example (Binary arithmetic). The binary arithmetic of computers relies on a 
set of two symbols, 0 and 1, which will be defined with yet other sets in this chapter. 


3.2 Example (Geometries). Geometries can be designed entirely with sets. Points 
are sets (of sets of coordinates), while lines, planes, and space are sets of points. 
Points, lines, planes, and space are “primitive” objects that may remain undefined, 
but relations between them are specified through axioms. For instance, the first 
axiom of incidence geometry specifies that through any two distinct points passes 
exactly one line [61, p. 3]. Likewise in this chapter, mathematical “sets” are 
“primitive” objects that remain undefined, while features of sets and relations 
between sets are specified by axioms. 


The foregoing examples already demonstrate a major difficulty in using problems 
about “real” objects to illustrate logical and mathematical concepts: no exact 
answers might be available. For an example as elementary as binary arithmetic, 
electronic digital computers internally do not use anything like the symbols 0 and 
1: indeed, they use two electrical potentials confined to two mutually exclusive 
ranges, neither of which need contain any zero [48, p. 642], with different ranges on 
different machines [51, p. 60, fig. 3-1; p. 83], [138, p. 4-5, §1], in the reverse order 
on other machines [103, p. 1-4], sometimes reversing the order within the same 
machine [23, § 5]. A precise answer would involve more advanced engineering, 
logic, mathematics, and physics. Therefore, the sets in the present exposition will 
not contain “real” objects; instead, all the following sets will contain only abstract 
objects defined by precise rules. The judicious use of such abstract mathematical 
sets in applied disciplines from astronomy to zoology is precisely the task of such 
disciplines. 


3.2 Sets and Subsets 


This section introduces mathematical sets by means of the concept of set member- 
ship. The predicate of set membership then allows for the definition of the concepts 
of subset, superset, and a derived predicate of equality. 


3.2.1 Equality and Extensionality 


One of the major mathematical achievements around the beginning of the twentieth 
century was the realization that most of mathematics and computer science consists 
of logical relations between abstract objects called sets [8, p. 3], [83]. There is no 
definition of mathematical sets. Indeed, such a definition would have to define sets in 
terms of yet more foundational objects, but sets are the most foundational objects. 
Henceforth, here and as in other texts [22, p. 50], [141, p. 60], all mathematical 
objects are sets, and all quantified variables designate sets: 


andere Objekte als Mengen existieren für uns überhaupt nicht 
(“for us objects other than sets simply do not exist”) [36, p. 271]. 


Instead of a definition of sets, a few “axioms” specify certain characteristics of 
sets. Such axioms are also called postulates because they are appended to the list of 
axioms for the underlying logic [18, § 55]. Such axioms of logic describe universal 
patterns of reasoning applicable in all contexts. In contrast, postulates specify the 
kind of objects considered in a particular context. The distinction between logical 
axioms and applied postulates is convenient to distinguish between reasoning and 
objects. Yet this distinction is somewhat artificial, because changing the logical 
axioms also changes the provable properties of the applied objects, which amounts 
to changing the kind of objects under consideration [18, p. 317, footnote 520]. 
Consequently, following tradition, the postulates of set theory are also labeled here 
as axioms. 

The set theory presented here involves one undefined primitive binary relation, 
denoted by ∈ and called “membership.” The symbol ∈ is a typographical variation 
on the lowercase Greek letter ε (read “epsilon”), selected here as the first letter of the 
copula ἐστί (pronounced “es-tee”), meaning “is” [36, p. 272]. The notation X ∈ Y is 
read in various ways as “X is an element of Y,” “X is a member of Y,” or “X belongs 
to Y.” For any set X and any set Y, the atomic formula X ∈ Y has a Truth value, so 
that X ∈ Y is either True (if X is an element of Y) or False (if X is not an element of 
Y). Thus, the following generalization of the tautology (B) ∨ [¬(B)] is universally 
valid: 


⊢ ∀X(∀Y{(X ∈ Y) ∨ [¬(X ∈ Y)]}). 


The relation ∈ of membership is the only foundational relation between sets. 
Consequently, the only characteristics of a set are its elements. Because set theory 
involves only one relation, every set A can relate to other sets X and Y in only four 
ways: 


X ∈ A, 
¬(X ∈ A); 
A ∈ Y, 
¬(A ∈ Y). 


Whether A ∈ Y or ¬(A ∈ Y) might also be considered as a characteristic of 
A. Consequently, rather than involving every element of A (every set X such that 
X ∈ A), another way to define the characteristics of A might involve every set 
of which A is an element (every set Y such that A ∈ Y). However, if the only 
characteristics of a set are its elements, then the two ways ought to be logically 
equivalent; this equivalence forms the essence of the axiom of extensionality. 


Axiom S1 (Axiom of extensionality) 
⊢ ∀A{∀B[({∀X[(X ∈ A) ⇔ (X ∈ B)]} ⇔ {∀Y[(A ∈ Y) ⇔ (B ∈ Y)]})]}. 


In the axiom of extensionality (S1), the formula ∀X[(X ∈ A) ⇔ (X ∈ B)] states 
that each set X is an element of A if and only if X is an element of B. The formula 
∀Y[(A ∈ Y) ⇔ (B ∈ Y)] states that A is an element of Y if and only if B is 
an element of Y. The axiom of extensionality states that these two formulae are 
logically equivalent. 

Yet another way to state that two sets have exactly the same characteristics 
involves a derived binary relation (derived from ∈) denoted by = and called 
“equality.” For each set A and each set B, the formula A = B (read “A equals 
B”) means that A and B have exactly the same characteristics. By the axiom of 
extensionality, the equality of two sets can be stated in two logically equivalent 
ways: 


⊢ ∀A{∀B[(A = B) ⇔ {∀X[(X ∈ A) ⇔ (X ∈ B)]}]}, 
⊢ ∀A{∀B[(A = B) ⇔ {∀Y[(A ∈ Y) ⇔ (B ∈ Y)]}]}. 

The notation A = B is a shorthand to state that the following two formulae hold: 

∀X[(X ∈ A) ⇔ (X ∈ B)], 
∀Y[(A ∈ Y) ⇔ (B ∈ Y)]. 


The axiom of extensionality states that these two formulae are logically equivalent. 

A variation consists in defining A = B as an abbreviation of the first formula, 
∀X[(X ∈ A) ⇔ (X ∈ B)], [8, p. 4-5], [36, p. 272-273, Def. 2], and then in adopting 
the axiom that if A = B then the second formula holds: ∀Y[(A ∈ Y) ⇔ (B ∈ Y)] 
[36, p. 274, Axiom I]. 

There is another presentation of set theory with two undefined relations, equality
(=) and membership (∈). Then the axiom of extensionality specifies that two sets
are the “same” set if and only if they contain the “same” elements. In this exposition
the distinction just made does not matter, because equality (=) serves only as a
shorthand: all operations with sets pertain to elements of those sets. (See also the
discussions by Bernays [8, p. 53] and Fraenkel [8, p. 6-8].) For the negations of
membership and equality, the following abbreviations prove convenient.


3.3 Definition. The symbols ∉ and ≠ denote the negations of ∈ and = so that
⊢ ∀X∀Y{(X ∉ Y) ⇔ [¬(X ∈ Y)]};
⊢ ∀A∀B{(A ≠ B) ⇔ [¬(A = B)]}.


At the elementary stage of set theory, most formal logical proofs of relations 
between sets are straightforward, in the sense that they use only axioms and 
definitions to establish a sequence of equivalences between the objective of the 
proof and a theorem or universally valid formula. Such formal logical proofs are 
usually longer than “informal” proofs. To show a first example of a proof within set 
theory — a formal version and an informal version — the following theorem states 
that each set equals itself.


3.4 Remark. In designing a proof we may at any stage start from the conclusion
— but we may not assume it as True — and then search for logically equivalent 
formulae that connect the conclusion to other formulae that we know how to prove. 
Smullyan’s method of tableaux uses such an approach [117, Ch. II, p. 15-30]. 


3.5 Theorem. Each set equals itself: the formula ∀S (S = S) is universally valid.
Proof. An informal proof can consist of the following statements.


• Every set X is an element of S if and only if X is an element of S;
• hence S = S by definition of the equality of sets and extensionality (S1).


One method to design a formal proof transforms the objective, here the yet unproved
formula ∀S (S = S), into logically equivalent formulae, until one such equivalent
formula appears that is a theorem thanks to an axiom or to a previously proven
theorem. For instance, substituting S for A and also S for B in the axiom of
extensionality gives


S = S   yet unproved,
⇕   definition of =
∀X[(X ∈ S) ⇔ (X ∈ S)],


which is in prenex form, and where the logical formula (X ∈ S) ⇔ (X ∈ S) has the
pattern of the theorem (P) ⇔ (P). Thus, a complete proof may proceed as follows:


⊢ (P) ⇔ (P)   reflexivity of equivalence (theorem 1.63),
⊢ (X ∈ S) ⇔ (X ∈ S)   substitution in the theorem (P) ⇔ (P),
⊢ ∀S{∀X[(X ∈ S) ⇔ (X ∈ S)]}   generalizations, first on X, then on S,
⊢ ∀S(S = S)   definition of = and extensionality (S1).


The proof just presented relies on one of the two formulae for the axiom of
extensionality: ∀X[(X ∈ A) ⇔ (X ∈ B)]. Another proof could rely on the other
formula: ∀Y[(A ∈ Y) ⇔ (B ∈ Y)]. □


More generally, by the properties of a logical equality-predicate derived from a
binary predicate, here ∈, as explained in section 2.5, the equality of sets is


• reflexive: ⊢ ∀S(S = S) (every set equals itself), also proved in theorem 3.5,

• symmetric: ∀A∀B[(A = B) ⇒ (B = A)] (if A = B, then B = A),

• transitive: ∀A∀B∀C{[(A = B) ∧ (B = C)] ⇒ (A = C)} (if A = B and B = C,
then A = C),


and equality also allows substitutions of mutually equal sets in theorems.
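
The two criteria in the axiom of extensionality can be tried out on small finite sets. The following Python sketch is an illustration only and not part of the formal theory: it assumes that a few finite sets are modelled by Python's built-in frozenset, and the helper names same_elements and same_memberships are hypothetical. It checks, over a handful of sample sets, that both criteria agree with each other and with frozenset equality.

```python
EMPTY = frozenset()                 # ∅
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({EMPTY, ONE})       # {∅, {∅}}

def same_elements(a, b, candidates):
    """First criterion of S1: each candidate X lies in a iff it lies in b."""
    return all((x in a) == (x in b) for x in candidates)

def same_memberships(a, b, containers):
    """Second criterion of S1: a lies in Y iff b lies in Y, for each container Y."""
    return all((a in y) == (b in y) for y in containers)

# Sample sets (an illustrative choice); every element of a sample is itself a sample.
sets = [EMPTY, ONE, TWO, frozenset({ONE})]
# Containers include the singleton of each sample, so distinct samples can be told apart.
containers = sets + [frozenset({s}) for s in sets]

agree = all(
    same_elements(a, b, sets) == same_memberships(a, b, containers) == (a == b)
    for a in sets for b in sets
)
print(agree)  # True
```

Such a finite check illustrates the axiom on a few samples; it proves nothing about all sets.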

The axiom of extensionality merely provides two logically equivalent criteria to
test whether sets have exactly the same characteristics. However, so far in this theory
there is no “set” yet. The “existence” of at least one set (or, more accurately, a
convention about an abstract concept of a specific set) requires a second axiom.


3.2.2 The Empty Set


The second axiom, called the axiom of the empty set, guarantees the existence of
at least one set, denoted by ∅ or also by { }; this set contains no element.


Axiom S2 (Axiom of the empty set) ⊢ ∀X[¬(X ∈ ∅)].


Set theory could dispense with the constant ∅ and state the axiom of the
empty set in the alternative form ⊢ ∃E[∀X(X ∉ E)]. In the present theory, this
alternative form is a consequence of axiom S2. Indeed, with P denoting the formula
∀X(X ∉ E), theorem 2.48 becomes


⊢ [Subf^E_∅(P)] ⇒ [∃E(P)],
⊢ [∀X(X ∉ ∅)] ⇒ {∃E[∀X(X ∉ E)]}.


The alternative form ⊢ ∃E[∀X(X ∉ E)] is more cumbersome, because it does not
provide a name for any such set.

The determination of the Truth value of an equality A = B requires prior 
definitions of both sets A and B. In contrast, there exists a different use of the 
same concept of equality, denoted by C := D (read “let C equal D”) to specify
a hitherto undefined set C in terms of an already defined set D [59, p. 8], [121, p. 5]. 
Alternatively, the notation D =: C (also read “let C equal D”) can also serve to 
specify C in terms of D, especially where a derivation leads to a lengthy formula D, 
which can thus be abbreviated by a shorter variable or string C [121, p. 271, p. 347]. 


3.6 Example. The notation E := ∅ specifies that E stands for ∅.


3.2.3 Subsets and Supersets 


In some circumstances, only some of the elements of a set prove useful; the 
following definition then allows for the grouping of all such elements into a “subset.” 


3.7 Definition (Subsets and supersets). For each set A, for each set B, the set A is
a subset of the set B if and only if each element of A is also an element of B. Either
notation A ⊆ B or A ⊂ B indicates that “A is a subset of B”; thus,


⊢ ∀A{∀B[(A ⊆ B) ⇔ {∀X[(X ∈ A) ⇒ (X ∈ B)]}]}.


Similarly, a set B is a superset of a set A if and only if A ⊆ B, a relation also denoted
by B ⊇ A or B ⊃ A.


In the definition of subsets, the equivalence (⇔) states that the relation of subset
(A ⊆ B) is logically equivalent to the formula ∀X[(X ∈ A) ⇒ (X ∈ B)]. In this
formula, for each set X the logical implication (⇒) states that if X is an element of
A, then X is also an element of B.

The concept of subset is so different from the concept of element as to warrant
different terminologies, for instance, reading A ⊆ B as “A is a subset of B” but
reading A ∈ B as “A is an element of B.” In contrast, such vague phrases as “A is
in B” or “B contains A” do not have any significance, unless they are supplemented
with “as an element” or “as a subset” [36, p. 272].

There also exist symbols more specific than A ⊆ B. For instance, A ⊊ B or A ⫋ B
or A ⊂≠ B indicate that A is a subset of B different from B; thus,


⊢ ∀A(∀B{(A ⊊ B) ⇔ (A ⫋ B) ⇔ (A ⊂≠ B) ⇔ [(A ⊆ B) ∧ (A ≠ B)]}).


Similarly, B ⊋ A or B ⫌ A or B ⊃≠ A stand for (B ⊇ A) ∧ (B ≠ A).

(However, some authors use ⊂ to mean ⊊ [140, § 3.12, p. 59-60].)

The following theorems provide further examples and practice with proofs in set 
theory, and demonstrate how to design a proof. The first three theorems establish 
features of the concept of subset: reflexivity, anti-symmetry, and transitivity. 


3.8 Theorem (reflexivity of ⊆). Each set is a subset of itself: ⊢ ∀S(S ⊆ S).


Proof. An informal proof may consist of the single statement that each element of
S is also an element of S, whence S is a subset of S, by definition 3.7.

The design of a more formal proof can transform the set-theoretic formula S ⊆ S
into a logical formula, and verify that it is a logical theorem:


∀S(S ⊆ S)   yet unproved,
⇕   definition of ⊆
∀S{∀X[(X ∈ S) ⇒ (X ∈ S)]}


which is in the prenex form ∀S{∀X[(P) ⇒ (P)]}, where the matrix is the theorem
(P) ⇒ (P) (theorem 1.14), here with X ∈ S instead of P. Thus a complete proof can
proceed as follows.


⊢ (P) ⇒ (P)   theorem from implicational logic (1.14),
⊢ (X ∈ S) ⇒ (X ∈ S)   substitution in theorem 1.14,
⊢ ∀S{∀X[(X ∈ S) ⇒ (X ∈ S)]}   generalizations, first on X, then on S,
⊢ ∀S(S ⊆ S)   definition (3.7) of subsets.
□

3.9 Theorem (transitivity of ⊆). For all sets A, B, and C, if A ⊆ B and B ⊆ C,
then A ⊆ C:
⊢ ∀A∀B∀C{[(A ⊆ B) ∧ (B ⊆ C)] ⇒ (A ⊆ C)}.


Proof. An informal proof can proceed as follows.


• If each element of A is an element of B,
• and if each element of B is an element of C,
• then each element of A is an element of B and hence also an element of C,
• whence A is a subset of C, by definition of subset.


The design of a formal proof can unravel the set-theoretic formula [(A ⊆ B) ∧
(B ⊆ C)] ⇒ (A ⊆ C) into a logical formula by means of the definition of subset,
and verify that the resulting formula is a theorem from logic:


[(A ⊆ B) ∧ (B ⊆ C)] ⇒ (A ⊆ C)   yet unproved,
⇕   definition of ⊆
({∀X[(X ∈ A) ⇒ (X ∈ B)]} ∧ {∀X[(X ∈ B) ⇒ (X ∈ C)]})
⇒ {∀X[(X ∈ A) ⇒ (X ∈ C)]}
⇕   theorems 2.54, 2.75, and 2.76,
[∀X({[(X ∈ A) ⇒ (X ∈ B)]} ∧ {[(X ∈ B) ⇒ (X ∈ C)]})]
⇒ {∀X[(X ∈ A) ⇒ (X ∈ C)]}
⇑   theorem 2.35,
∀X{({[(X ∈ A) ⇒ (X ∈ B)]} ∧ {[(X ∈ B) ⇒ (X ∈ C)]})
⇒ [(X ∈ A) ⇒ (X ∈ C)]}


which is in prenex normal form, with a matrix of the type


{[(P) ⇒ (Q)] ∧ [(Q) ⇒ (R)]} ⇒ [(P) ⇒ (R)],


which is another form of the transitivity of the implication (theorem 1.76). Thus the
last line is a theorem. Reversing the order of the steps and inserting ∀A and ∀B
before each step (generalizing) then completes the proof. □


3.10 Theorem (anti-symmetry of ⊆). Two sets are subsets of each other if and
only if they equal each other:
⊢ ∀A(∀B{(A = B) ⇔ [(A ⊆ B) ∧ (B ⊆ A)]}).


Proof. An informal proof can proceed as follows.


• If each element of A is an element of B,
• and if each element of B is an element of A,
• then A and B have exactly the same elements;
• hence A = B by extensionality, and conversely.


A formal proof can establish the following sequence of logical equivalences.


(A ⊆ B) ∧ (B ⊆ A)
⇕   definition of subset,
{∀X[(X ∈ A) ⇒ (X ∈ B)]} ∧ {∀X[(X ∈ B) ⇒ (X ∈ A)]}
⇕   thanks to ⊢ {[∀X(P)] ∧ [∀X(Q)]} ⇔ {∀X[(P) ∧ (Q)]}, by theorem 2.75,
∀X{[(X ∈ A) ⇒ (X ∈ B)] ∧ [(X ∈ B) ⇒ (X ∈ A)]}
⇕   definition of ⇔,
∀X[(X ∈ A) ⇔ (X ∈ B)]
⇕   axiom of extensionality (S1).
A = B


Inserting ∀A and ∀B before each step (generalizing) then completes the proof. □


The reflexivity, anti-symmetry, and transitivity of the concept of subset also result
more generally from the properties of a logical ranking-predicate, here ⊆, derived
from a binary predicate, here ∈, as explained in section 2.5.
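
As an informal companion to theorems 3.8, 3.9, and 3.10, the following Python sketch tests the three properties on a few sample sets. It is a finite spot check under stated assumptions, not a proof: sets are modelled by Python's frozenset, and is_subset is a hypothetical helper that transcribes definition 3.7 literally.

```python
from itertools import product

EMPTY = frozenset()                 # ∅
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({EMPTY, ONE})       # {∅, {∅}}
sets = [EMPTY, ONE, TWO, frozenset({ONE})]

def is_subset(a, b):
    """Definition 3.7: every element of a is also an element of b."""
    return all(x in b for x in a)

# Theorem 3.8: each sample is a subset of itself.
reflexive = all(is_subset(s, s) for s in sets)

# Theorem 3.10: equality holds exactly when both inclusions hold.
antisymmetric = all(
    (a == b) == (is_subset(a, b) and is_subset(b, a))
    for a, b in product(sets, repeat=2)
)

# Theorem 3.9: whenever a ⊆ b and b ⊆ c, also a ⊆ c.
transitive = all(
    is_subset(a, c)
    for a, b, c in product(sets, repeat=3)
    if is_subset(a, b) and is_subset(b, c)
)

print(reflexive, antisymmetric, transitive)  # True True True
```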

The next theorems focus on the subsets and supersets of the empty set. 


3.11 Theorem. The empty set is a subset of every set: ⊢ ∀S(∅ ⊆ S).
Proof. An informal proof can use the converse law of contraposition (axiom P3):


• Every set not in S is also not in ∅, because no set belongs to ∅;
• the contraposition then means that every element in ∅ is also in S;
• hence ∅ is a subset of S, by the definition of subsets.


The design of a more formal proof can transform the set-theoretic formula ∅ ⊆ S
into a logical formula, and verify that it is a logical theorem.


∀S(∅ ⊆ S)   yet unproved,
⇕   definition of ⊆
∀S{∀X[(X ∈ ∅) ⇒ (X ∈ S)]}
⇕   contraposition and theorem 2.45
∀S{∀X[(X ∉ S) ⇒ (X ∉ ∅)]}


which is in the prenex form ∀S∀X[(P) ⇒ (Q)], where the matrix (P) ⇒ (Q) is a
theorem, because so is Q. Thus a complete proof may consist of the following steps.


⊢ ∀X(X ∉ ∅)   axiom of the empty set (S2),
⊢ X ∉ ∅   logical axiom of specialization (Q1),
⊢ (X ∉ S) ⇒ (X ∉ ∅)   theorem 1.12 from implicational logic,
⊢ ∀S{∀X[(X ∉ S) ⇒ (X ∉ ∅)]}   generalizations,
⊢ ∀S{∀X[(X ∈ ∅) ⇒ (X ∈ S)]}   contraposition and theorem 2.45,
⊢ ∀S(∅ ⊆ S)   definition 3.7 of ⊆.
□


3.12 Theorem. Every subset of the empty set is the empty set: ⊢ ∀S[(S ⊆ ∅) ⇒
(S = ∅)].


Proof. Theorem 3.11 already guarantees that the empty set is a subset of every set:
⊢ ∀S(∅ ⊆ S), in particular, for every subset S of the empty set, for which the
reverse inclusion also holds (S ⊆ ∅). Hence S = ∅ by theorem 3.10.


⊢ ∀S(∅ ⊆ S)   theorem 3.11,
⊢ ∅ ⊆ S   specialization (axiom Q1),
⊢ (∅ ⊆ S) ⇒ {(S ⊆ ∅) ⇒ [(∅ ⊆ S) ∧ (S ⊆ ∅)]}   theorem 1.54,
⊢ (S ⊆ ∅) ⇒ [(∅ ⊆ S) ∧ (S ⊆ ∅)]   Detachment,
⊢ (S ⊆ ∅) ⇒ (S = ∅)   theorem 3.10 and transitivity,
⊢ ∀S[(S ⊆ ∅) ⇒ (S = ∅)]   Generalization.
□


The axioms of extensionality (S1) and of the empty set (S2) apply to all sets. Yet
so far the theory guarantees the existence of only one set, namely the empty set ∅,
for which ∅ ∈ ∅ is False, but ∅ = ∅ and ∅ ⊆ ∅ are True. Sets other than the
empty set require further axioms, as explained in the next section.
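
The vacuous reasoning behind theorems 3.11 and 3.12 has a direct counterpart in programming: a universal test over no elements succeeds. The Python sketch below is only an illustration, assuming that sets are modelled by frozenset and using a hypothetical helper is_subset that follows definition 3.7.

```python
EMPTY = frozenset()                                   # ∅
samples = [EMPTY, frozenset({EMPTY}), frozenset({EMPTY, frozenset({EMPTY})})]

def is_subset(a, b):
    """Definition 3.7: every element of a is also an element of b."""
    return all(x in b for x in a)

# Theorem 3.11: all() over the (nonexistent) elements of ∅ is vacuously True.
empty_below_all = all(is_subset(EMPTY, s) for s in samples)

# Theorem 3.12: among the samples, only ∅ is a subset of ∅.
subsets_of_empty = [s for s in samples if is_subset(s, EMPTY)]

print(empty_below_all)             # True
print(subsets_of_empty == [EMPTY]) # True
print(EMPTY in EMPTY)              # False: ∅ ∈ ∅ does not hold
```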


3.2.4 Exercises on Sets and Subsets 


3.1. Write a logical formula stating that a set S is not the empty set.
3.2. Write a logical formula stating that a set A is not a subset of B.
3.3. Write a logical formula stating that a set A is not a superset of B.
3.4. Write a logical formula stating that a set A is not equal to a set B.
3.5. Prove that ∅ ∈ ∅ is not a theorem.

3.6. Prove that ∅ ⊆ ∅ is a theorem.

3.7. Prove that ∅ = ∅ is a theorem.


3.8. Use the second formula, ∀Y[(A ∈ Y) ⇔ (B ∈ Y)], in the axiom of
extensionality to write a proof that S = S for each set S.


3.9. Prove that ∅ is the only subset of ∅: ∀S[(S ⊆ ∅) ⇒ (S = ∅)].

3.10. For each set S, prove that S ⊋ S is not a theorem.

3.11. Prove that if a set S is the only subset of itself, then S is empty.

3.12. Prove that there does not exist a set S such that S ∈ Y for every set Y.
3.13. Prove that if a set S is a subset of every set, then S is empty.

3.14. For all sets A, B, and C, prove that if A = B and B = C, then A = C.
3.15. For all sets A, B, and C, prove that if A ∈ B and B ⊆ C, then A ∈ C.
3.16. For all sets A, B, and C, prove that if A ⊆ B and B ⊆ C, then A ⊆ C.
3.17. For all sets A and B, prove that A = B if and only if for every set Z


(A ⊆ Z) ⇔ (B ⊆ Z).

3.18. For all sets A and B, prove that A ⊆ B if and only if for each set Z
(B ⊆ Z) ⇒ (A ⊆ Z).

3.19. For all sets C and D, prove that C ⊇ D if and only if for each set W
(D ⊇ W) ⇒ (C ⊇ W).

3.20. For all sets R and S, prove that R ⊇ S if and only if for each set W


(W ∈ S) ⇒ (W ∈ R).


3.3 Pairing, Power, and Separation 


This section introduces axioms to form sets with one or two elements (pairing), all 
subsets of a set (power), or selections of specific elements into a subset (separation). 


3.3.1 Pairing 


A theory allowing for sets other than the empty set requires additional axioms. For 
instance, the axiom of pairing states that for every set H and every set K, there 
exists a set L, also denoted by {H, K}, which contains only the elements H and K. 


Axiom S3 (Axiom of pairing)


⊢ ∀H{∀K[∃L(∀X{(X ∈ L) ⇔ [(X = H) ∨ (X = K)]})]}.


In the axiom of pairing (S3), the equivalence (⇔) states that a set X is an element
of L if and only if X equals H or X equals K. Because the logical “or” is inclusive,
the axiom of pairing thus allows {H, K} to contain both H and K. Moreover, because
the logical “or” commutes, the order in which H and K appear does not matter.


3.13 Theorem. For each set H and for each set K, ⊢ {H, K} = {K, H}.
Proof. An informal proof can compare the elements of {H, K} and {K, H}:


• The set {H, K} contains the elements H and K, but no other element;
• the set {K, H} contains the elements K and H, but no other element;
• thus {H, K} and {K, H} contain exactly the same elements;
• therefore {H, K} = {K, H} by the axiom of extensionality (S1).


A formal proof can use ⊢ [(P) ∨ (Q)] ⇔ [(Q) ∨ (P)] (theorem 1.79) to show that

[(X = H) ∨ (X = K)] ⇔ [(X = K) ∨ (X = H)]

is a theorem, and hence also

(X ∈ {H, K}) ⇔ [(X = H) ∨ (X = K)] ⇔ [(X = K) ∨ (X = H)] ⇔ (X ∈ {K, H}),


whence ⊢ {H, K} = {K, H} by the axiom of extensionality (S1). □


With H := S and K := S, the following theorem shows that for each set S, there
exists a set, denoted by {S}, which contains only one element, S.


3.14 Theorem. For each set S there exists a set L, also denoted by {S}, which
contains only the element S; formally, ⊢ ∀S(∃L{∀X[(X ∈ L) ⇔ (X = S)]}).


Proof. An informal proof can use the axiom of pairing with H := S and K := S:


• For each set S, the axiom of pairing yields a set {S, S};
• by the axiom of pairing, X ∈ {S, S} if and only if X = S or X = S;
• yet “X = S or X = S” merely repeats the same statement “X = S”;
• therefore {S, S} = {S} by the axiom of extensionality.


A formal proof can rely on (P) ⇔ [(P) ∨ (P)] (theorem 1.68):


⊢ ∀H{∀K[∃L(∀X{(X ∈ L) ⇔ [(X = H) ∨ (X = K)]})]}   axiom S3,
⊢ ∃L(∀X{(X ∈ L) ⇔ [(X = S) ∨ (X = S)]})   Subf^H_S, Subf^K_S,
⊢ ∀S[∃L(∀X{(X ∈ L) ⇔ [(X = S) ∨ (X = S)]})]   Generalization,
⊢ ∀S(∃L{∀X[(X ∈ L) ⇔ (X = S)]})   ⊢ (P) ⇔ [(P) ∨ (P)].
□

3.15 Definition (singleton). A singleton is a set containing exactly one element. 


3.16 Example. In theorem 3.14, substituting ∅ for S gives the set L = {∅}, which
contains the single element ∅. In particular, ∅ ∈ {∅}, so that {∅} is not empty.


3.17 Remark. The distinction between ∅ and {∅} is crucial. With different notations,
to appreciate better the difference between ∅ = { } and {∅} = { { } },
consider ∅ as an empty bag; then {∅} is a bag {...} with another empty bag ∅
inside it, also known as a “double bag” in the market place. There, a single empty
bag { } = ∅ might not be sufficiently strong to hold a six-pack of heavy glass
bottles filled with your favorite beverage. (Bottled water, of course, what were you
thinking?) That’s why you ask for a double bag {∅} = { { } } in which to put and
then carry the heavy six-pack. If the six-pack also comes with a wrapping, then
the combined packaging becomes a “triple bag”: {{∅}} = { { { } } }, which is yet
another set.


Theorem 3.18 confirms that the sets ∅ and {∅} have different characteristics.
3.18 Theorem. The sets ∅ and {∅} are two distinct sets: ∅ ≠ {∅}.


Proof. An informal proof can utilize substitutions in previous axioms and
theorems:


• By definition of the empty set, ∅ ∉ ∅ (by a substitution in axiom S2);
• moreover, ∅ ∈ {∅} (by a substitution in theorem 3.14);
• hence the two sets ∅ and {∅} have different elements: ∅ ∈ {∅} but ∅ ∉ ∅;
• consequently, ∅ ≠ {∅}, by the axiom of extensionality.


The following formal proof uses contraposition in the theorems


⊢ (P) ⇒ {[(P) ⇒ (Q)] ⇒ (Q)}   law of assertion (theorem 1.38),
⊢ (P) ⇒ ([¬(Q)] ⇒ {¬[(P) ⇒ (Q)]})   theorem 1.54,


whence if P and ¬(Q) are theorems, then so is ¬[(P) ⇒ (Q)] by Detachment:


⊢ ¬(∅ ∈ ∅)   ¬(Q): substitution in axiom S2,
⊢ ∅ ∈ {∅}   P: substitution in theorem 3.14,
⊢ ¬[(∅ ∈ {∅}) ⇒ (∅ ∈ ∅)]   ¬[(P) ⇒ (Q)],
⊢ ∃X{¬[(X ∈ {∅}) ⇒ (X ∈ ∅)]}   Subf^X_∅ with theorem 2.48,
⊢ ¬{∀X[(X ∈ {∅}) ⇒ (X ∈ ∅)]}   {∃X[¬(P)]} ⇒ {¬[∀X(P)]} (axiom Q4),
⊢ ¬({∅} ⊆ ∅)   definition 3.7 of subsets,
⊢ ∅ ≠ {∅}   contraposition of theorem 3.10.


The distinction between ∅ and {∅} allows for the formation of other sets.


3.19 Example. Substituting H := ∅ and K := {∅} in the axiom of pairing gives
the set L = {H, K} = {∅, {∅}}. This set L has two elements, because ∅ ≠ {∅}.


3.20 Example. Substituting S := {∅} in theorem 3.14 gives the set L = {S} =
{{∅}}. This set L has one element, in effect {∅}; thus, {∅} ∈ {{∅}}.
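
The sets built so far by pairing can be reproduced informally in Python. The sketch below is an illustration rather than part of the theory: it assumes that sets are modelled by frozenset, and the function names pair and singleton are hypothetical. Because a frozenset ignores order and repetition, theorem 3.13 and the identity {S, S} = {S} from theorem 3.14 hold automatically in this model.

```python
EMPTY = frozenset()  # ∅

def pair(h, k):
    """Axiom S3: the set whose only elements are h and k."""
    return frozenset({h, k})

def singleton(s):
    """Theorem 3.14: {s} obtained as the pair {s, s}."""
    return pair(s, s)

one = singleton(EMPTY)            # {∅}, example 3.16
two = pair(EMPTY, one)            # {∅, {∅}}, example 3.19
nested = singleton(one)           # {{∅}}, example 3.20

print(pair(EMPTY, one) == pair(one, EMPTY))  # True: theorem 3.13
print(EMPTY != one, len(one))                # True 1: theorem 3.18
print(len(two), one in nested)               # 2 True
```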


3.3.2 Power Sets 


For sets with more than two elements, a new axiom becomes necessary. The axiom
of the power set states that for each set A, the collection of all subsets of A forms a
new set, denoted by 𝒫 or by 𝒫(A) and called the power set of A:


Axiom S4 (Axiom of the power set)
⊢ ∀A(∃𝒫{∀S[(S ∈ 𝒫) ⇔ (S ⊆ A)]}).


In the axiom of the power set (S4), the equivalence (⇔) states that a set S is an
element of the power set 𝒫(A) if and only if S is a subset of the set A.


3.21 Example. The empty set ∅ is the only subset of itself, by theorem 3.12. Hence
its power set has only one element, ∅, so that 𝒫(∅) = {∅}.


A set A and its power set 𝒫(A) might have no elements in common. Theorem 3.22
shows that for every element X of A, the singleton {X} is an element
of 𝒫(A).

3.22 Theorem. For all sets A and X, if X ∈ A, then {X} ∈ 𝒫(A), and conversely:


∀A∀X{(X ∈ A) ⇔ [{X} ∈ 𝒫(A)]}.


Proof. An informal proof can rely on the definitions of subsets and power sets.


• X ∈ A if and only if {X} ⊆ A, by definitions of {X} and subsets;
• {X} ⊆ A if and only if {X} ∈ 𝒫(A) by definition of power sets.


A more formal proof carries out similar verifications from the formal definitions.


∀A(∀X{(X ∈ A) ⇔ [{X} ∈ 𝒫(A)]})   yet unproved,
⇕   definition of power sets,
∀A(∀X{(X ∈ A) ⇔ [{X} ⊆ A]})
⇕   definition of subsets,
∀A{∀X[(X ∈ A) ⇔ {∀Z[(Z ∈ {X}) ⇒ (Z ∈ A)]}]}
⇕   definition of {X} (pairing),
∀A{∀X[(X ∈ A) ⇔ {∀Z[(Z = X) ⇒ (Z ∈ A)]}]}
⇕   theorems 2.34 and 2.38,
∀A{∀X[∀Z{(X ∈ A) ⇔ [(Z = X) ⇒ (Z ∈ A)]}]}


which holds by extensionality and substitutivity of equality. □
Theorem 3.23 shows that the power set of a singleton has exactly two elements.
3.23 Theorem. For each set H, 𝒫({H}) = {∅, {H}}.


Proof. An informal proof can list the subsets of {H} by cases:

• {H} ⊆ {H} by theorem 3.8,
• ∅ ⊆ {H} by theorem 3.11,
• {∅, {H}} ⊆ 𝒫({H}) by the previous two lines;
• if B ⊆ {H}, then B has no element or B has the single element H,
• hence {H} has no subsets other than ∅ and {H},
• thus 𝒫({H}) ⊆ {∅, {H}} by the previous two lines,
• whence 𝒫({H}) = {∅, {H}}.


A formal proof can proceed through two cases, using theorem 1.85:


⊢ ([(P) ⇒ (Q)] ∧ {[¬(P)] ⇒ (Q)}) ⇒ (Q),


with P for B = ∅ and Q for B ∈ {∅, {H}}. In the first case, (B = ∅) ⇒ (B ∈
{∅, {H}}) by substituting {∅, {H}} for Y in the axiom of extensionality (S1). In
the second case, the definitions of B ≠ ∅ and B ⊆ {H} give


(B ≠ ∅) ∧ (B ⊆ {H})
⇕   definitions of ∅ and ⊆,
[∃X(X ∈ B)] ∧ {∀Z[(Z ∈ B) ⇒ (Z ∈ {H})]}
⇕   definition of {H} by theorem 3.14,
[∃X(X ∈ B)] ∧ {∀Z[(Z ∈ B) ⇒ (Z = H)]}
⇓   (Z = H) ⇒ {∀Y[(Z ∈ Y) ⇔ (H ∈ Y)]}
∃X[(X ∈ B) ∧ {∀Z[(Z ∈ B) ⇒ (H ∈ B)]}]
⇓   specialization Subf^Z_X,
∃X{(X ∈ B) ∧ [(X ∈ B) ⇒ (H ∈ B)]}
⇓   {(P) ∧ [(P) ⇒ (Q)]} ⇒ (Q),
H ∈ B


Thus B ⊆ {H}, but H ∈ B, whence {H} ⊆ B, and hence B = {H}. □


3.24 Example. The singleton {∅} has two subsets: {∅} by theorem 3.8, and ∅ by
theorem 3.11. Moreover, {∅} has no other subsets. Hence, 𝒫({∅}) = {∅, {∅}}.


3.25 Counterexample. The set A := {{∅}} has no element in common with its
power set 𝒫({{∅}}) = {∅, {{∅}}}. In particular, A is not a subset of 𝒫(A).


The axioms of the empty set (S2), of pairing (S3), and of the power set (S4) allow
for increasingly large sets, for instance,


∅,
𝒫(∅) = {∅},
𝒫({∅}) = {∅, {∅}},
𝒫({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}}.
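
For finite sets, these power sets can be computed mechanically. The following Python sketch is offered as an illustration under stated assumptions, not as part of the axiomatic development: sets are modelled by frozenset, the function name power_set is hypothetical, and the subsets are enumerated with itertools.combinations from the standard library.

```python
from itertools import chain, combinations

def power_set(a):
    """All subsets of the frozenset a, collected into a frozenset (cf. axiom S4)."""
    elems = list(a)
    tuples = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)
    )
    return frozenset(frozenset(t) for t in tuples)

EMPTY = frozenset()
p0 = power_set(EMPTY)   # 𝒫(∅) = {∅}
p1 = power_set(p0)      # 𝒫({∅}) = {∅, {∅}}
p2 = power_set(p1)      # 𝒫({∅, {∅}}) has four elements

print(len(p0), len(p1), len(p2))  # 1 2 4

# Counterexample 3.25: A = {{∅}} shares no element with 𝒫(A).
A = frozenset({p0})
print(A.isdisjoint(power_set(A)))  # True
```

In this model, theorem 3.22 amounts to the observation that frozenset({x}) lies in power_set(a) exactly when x lies in a.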


However, the “selection” or “separation” of subsets requires yet another axiom.


3.3.3 Separation of Sets


For each set A and for each logical formula P, the axiom schema of separation 
“separates” or “sets aside” from A a subset S consisting of all the elements X of A 
for which P is True. The logical formula P may not involve any free occurrence 
of the variable S, but it may involve free occurrences of X and of other variables, 
as indicated in the following axiom by the ellipsis ..., which may represent other 
free variables. Such a “separation rule” to form subsets differs from other axioms, 
because the logic used here allows for the quantification of the elements, with the 
symbols ∀X, but this logic has no provision for the quantification of formulae: it
does not allow for expressions like ∀P with P standing for formulae. Thus, the
“separation rule” provides a schema for infinitely many axioms, in effect one axiom 
for each logical formula, whence its name. 


Axiom S5 (Axiom schema of separation) For each set A, and for each logical
formula P that does not contain any free occurrence of the variable S, there exists a
subset S ⊆ A that consists of only those elements X ∈ A for which P is True:


⊢ ∀A[∃S(∀X{(X ∈ S) ⇔ [(X ∈ A) ∧ (P)]})].

Common notations for the subset S have the formats


S = {X : (X ∈ A) ∧ (P)},
S = {X ∈ A : P}.


An alternative notation replaces the colon (:) by a vertical bar (|), but in the context 
of further mathematics such vertical bars become difficult to recognize against 
absolute values, norms, and other similar symbols. 


3.26 Example. Consider the set


A := 𝒫({∅, {∅}})
= {∅, {∅}, {{∅}}, {∅, {∅}}},


and let P be the formula ¬(X = ∅). Then


A = {∅, {∅}, {{∅}}, {∅, {∅}}},
(P) ⇔ [¬(X = ∅)],
S = {X : (X ∈ A) ∧ [¬(X = ∅)]},
S = {X ∈ A : ¬(X = ∅)},
S = {{∅}, {{∅}}, {∅, {∅}}}.


The existence of this set S would not have followed from the previous axioms.

The following examples demonstrate the axiom schema of separation with
various instances of the formula P. Additional examples appear in the next section.
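
Readers who program may recognize the axiom schema of separation as a filter. The Python sketch below replays example 3.26 informally; it assumes that sets are modelled by frozenset and that the formula P is represented by a Python predicate, and the name separate is hypothetical.

```python
def separate(a, predicate):
    """Axiom schema S5: the subset of a whose elements satisfy predicate."""
    return frozenset(x for x in a if predicate(x))

EMPTY = frozenset()                 # ∅
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({EMPTY, ONE})       # {∅, {∅}}

# A = 𝒫({∅, {∅}}) = {∅, {∅}, {{∅}}, {∅, {∅}}}
A = frozenset({EMPTY, ONE, frozenset({ONE}), TWO})

# P is the formula ¬(X = ∅).
S = separate(A, lambda x: x != EMPTY)

print(S == frozenset({ONE, frozenset({ONE}), TWO}))  # True
print(S <= A)                                        # True: S is a subset of A
```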


3.27 Example. For each set A and for each set B, let P be (X ∈ B). Then

S = {X : (X ∈ A) ∧ (X ∈ B)},


which is usually denoted by S = A ∩ B and called the intersection of A and B.

3.28 Example. For each set A and for each set B, let P be ¬(X ∈ B). Then


S = {X : (X ∈ A) ∧ [¬(X ∈ B)]},


which is usually denoted by S = A \ B and called the difference of A and B.


The symbol \ adopted here for the difference of two sets aims at avoiding
confusions with the arithmetic difference of sets A − B in such further branches
of mathematics as convexity, linear algebra, and functional analysis.


3.29 Example. For each set A, and for all subsets B ⊆ A and C ⊆ A, let P be the
formula (X ∈ B) ∨ (X ∈ C). Then

S = {X : (X ∈ A) ∧ [(X ∈ B) ∨ (X ∈ C)]},


which is usually denoted by S = B ∪ C and called the union of B and C.
Unions and intersections of more than two sets form the topic of the next section. 
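
Examples 3.27, 3.28, and 3.29 can be compared with the operators that Python already provides for finite sets. This sketch is an informal check under stated assumptions: sets are modelled by frozenset, the helper separate is hypothetical, and the particular sets A, B, and C are chosen only for illustration, with B and C subsets of A as example 3.29 requires.

```python
def separate(a, predicate):
    """Axiom schema S5: the subset of a whose elements satisfy predicate."""
    return frozenset(x for x in a if predicate(x))

EMPTY = frozenset()                 # ∅
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({EMPTY, ONE})       # {∅, {∅}}

A = frozenset({EMPTY, ONE, TWO})
B = frozenset({ONE, TWO})           # B ⊆ A
C = frozenset({EMPTY})              # C ⊆ A

intersection = separate(A, lambda x: x in B)           # example 3.27
difference = separate(A, lambda x: x not in B)         # example 3.28
union_in_A = separate(A, lambda x: x in B or x in C)   # example 3.29

# Compare with the built-in frozenset operators &, -, and |.
print(intersection == A & B, difference == A - B, union_in_A == B | C)  # True True True
```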


3.30 Remark. Zermelo introduced the axiom of separation (S5) as the “Axiom der
Aussonderung” [145, p. 263, Axiom III], which also translates as the “axiom of
triage” or “sifting” [8, p. 11, footnote 2]. With the axiom of separation, the theory
presented here can be called Zermelo’s set theory. Fraenkel and Skolem substituted
for the axiom of separation a more general axiom called the “axiom of replacement”
(“Axiom der Ersetzung”) [36, p. 309, Axiom VIII], [116], which can be stated with
the notation from definition 2.21 [22, p. 52]:


∀Z...∀W({∀X[∃!Y(Q)]} ⇒ ∀A∃B∀C[(C ∈ B) ⇔ (∃D{(D ∈ A) ∧ [Subf^X_D Subf^Y_C(Q)]})])

or, alternatively [128, p. 202],

∀A{[∀W∀X∀Y∀Z({(X ∈ A) ∧ [Subf^W_Y(Q)] ∧ [Subf^W_Z(Q)]} ⇒ (Y = Z))]
⇒ (∃B∀Y[(Y ∈ B) ⇔ {∃X{(X ∈ A) ∧ [Subf^W_Y(Q)]}}])}.

This formula can be better described with the concept of “mathematical function”
defined in section 3.6. The axiom of separation follows from the axiom of
replacement by substituting (P) ∧ (W = X) for Q.


3.3.4 Exercises on Pairing, Power, and Separation of Sets


3.21. Give examples of sets X, Y, Z with X ∈ Y and Y ∈ Z but X ∉ Z.
3.22. Provide examples of sets X, Y such that X ∈ Y but X ⊈ Y.
3.23. Provide examples of sets X, Y such that X ⊆ Y but X ∉ Y.
3.24. Provide examples of sets X, Y such that X ∈ Y and X ⊆ Y.
3.25. Provide examples of sets A, X such that X ∈ A but X ∉ 𝒫(A).
3.26. Provide examples of sets A, X such that X ⊆ A but X ⊈ 𝒫(A).
3.27. Prove that {∅} ≠ {{∅}}.

3.28. Prove that {∅} ≠ {∅, {∅}}.

3.29. Prove that {{∅}} ≠ {∅, {∅}}.

3.30. Determine whether {∅} ⊆ {{∅}}.

3.31. Determine whether {{∅}} ⊆ {{∅}}.

3.32. Determine whether {∅} ⊆ {∅, {∅}}.

3.33. Determine whether {{∅}} ⊆ {∅, {∅}}.


3.34. Prove that replacing ∨ by ∧ in axiom S3 yields the False formula
∀H{∀K[∃L(∀X{(X ∈ L) ⇔ [(X = H) ∧ (X = K)]})]}.

3.35. For theorem 3.13, explain how the word “and” in the informal proof
corresponds to the logical connective ∨ in the formal proof.


3.36. For each set S, prove that {∅, S} ⊆ 𝒫(S).


3.37. Prove that two sets equal each other if and only if they have the same power
sets: ⊢ ∀A(∀B{(A = B) ⇔ [𝒫(A) = 𝒫(B)]}).


3.38. Identify the set {∅} ∩ {∅, {∅}}.

3.39. Identify the set {{∅}} ∩ {∅, {∅}}.

3.40. Identify the set {∅} ∩ {{∅}}.

3.41. Identify the set {∅} ∪ {∅, {∅}} and one of its supersets.
3.42. Identify the set {{∅}} ∪ {∅, {∅}} and one of its supersets.
3.43. For each set S, identify the set S ∩ ∅.

3.44. For each set S, identify the set S ∪ ∅.

3.45. Prove that ∀A[(A \ ∅) = A].



3.46. Prove that ∀A[(A \ A) = ∅].

3.47. Prove that ∀A(∀B{[(A \ B) = ∅] ⇔ (A ⊆ B)}).

3.48. Prove that ∀A(∀B{[(A \ B) = A] ⇔ [(A ∩ B) = ∅]}).
3.49. Prove or disprove ∀A{∀B[(A \ B) = (B \ A)]}.

3.50. Prove or disprove ∀A[∀B(∀C{[(A \ B) \ C] = [A \ (B \ C)]})].


3.4 Unions and Intersections of Sets 


3.4.1 Unions of Sets 


Many mathematical situations involve unions and intersections of any “collection”
or “family” of sets, or, in other words, of any set of sets. For instance, the Venn
diagram in figure 3.1 (adapted from Hamburger and Pippert’s [52, 53]) illustrates
all the possible unions and intersections for a set of five sets. Scientists, for instance,
biologists, also use such Venn diagrams [7, p. 402, Fig. 2(B)], [90, p. 1472, Fig. 3].
For the existence of such general unions and intersections, however, new axioms
become necessary. Thus, the axiom of union asserts that for each set ℱ there exists
a new set U, denoted by ⋃ℱ and called the union of ℱ, which consists of all the
elements X of all the elements S of ℱ:


Axiom S6 (Axiom of union) For each set ℱ, there exists a set U, also denoted by
⋃ℱ, which consists of all the elements that belong to any element of ℱ:


⊢ ∀ℱ(∃U{∀X[(X ∈ U) ⇔ {∃S[(S ∈ ℱ) ∧ (X ∈ S)]}]}).


Fig. 3.1 Venn diagram for the unions and intersections of five sets [52, 53].
Scientists, for instance, biologists, also use such Venn diagrams [7, p. 402,
Fig. 2(B)], [90, p. 1472, Fig. 3].


In the axiom of union (S6), the equivalence (⇔) states that a set X is an element
of the union U = ⋃ℱ if and only if there exists an element S of ℱ such that X ∈ S.


3.31 Example. If ℱ = ∅, then (⋃∅) = ∅, because in the axiom of union (S6)
the condition (S ∈ ℱ) is False for every set S, and then the equivalent condition
X ∈ (⋃∅) is False for every set X. Thus ⋃∅ contains no element.


For the union of two sets, a special notation proves convenient.
3.32 Definition. The notation A ∪ B stands for ⋃{A, B}.

With only two sets in ℱ, the definition of ⋃ℱ simplifies considerably.
3.33 Theorem. For all sets A and B, if ℱ = {A, B}, then


∀X{[X ∈ ⋃{A, B}] ⇔ [(X ∈ A) ∨ (X ∈ B)]}.

Proof. For ℱ = {A, B}, the axiom of pairing (S3) shows that

(S ∈ ℱ) ⇔ [(S = A) ∨ (S = B)],

whence the axiom of union (S6) gives the following condition for X ∈ (A ∪ B):


[X ∈ (A ∪ B)] ⇔ {∃S[(S ∈ {A, B}) ∧ (X ∈ S)]}
⇔ (∃S{[(S = A) ∨ (S = B)] ∧ (X ∈ S)})
⇔ (∃S{[(S = A) ∧ (X ∈ S)] ∨ [(S = B) ∧ (X ∈ S)]})
⇔ {∃S[(X ∈ A) ∨ (X ∈ B)]}
⇔ [(X ∈ A) ∨ (X ∈ B)].


The last equivalences result from extensionality, which gives


[(S = A) ∧ (X ∈ S)] ⇔ (X ∈ A),
[(S = B) ∧ (X ∈ S)] ⇔ (X ∈ B).


Moreover, S does not occur in (X ∈ A) ∨ (X ∈ B), which makes ∃S superfluous, so
that {∃S[(X ∈ A) ∨ (X ∈ B)]} ⇔ [(X ∈ A) ∨ (X ∈ B)] by theorem 2.49. □


3.34 Example. For A := {∅, {∅}} and B := {{∅, {∅}}}, the union A ∪ B consists
of the elements that belong to A (these are ∅ and {∅}) or that belong to B (where
there is only one element, {∅, {∅}}). Therefore, A ∪ B = {∅, {∅}, {∅, {∅}}}.
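
The axiom of union and theorem 3.33 can be illustrated informally in Python. In the sketch below, sets are assumed to be modelled by frozenset, and the name big_union is hypothetical; the function gathers every element of every element of a family, and the result is compared with example 3.31, with example 3.34, and with the built-in operator | for two sets.

```python
def big_union(family):
    """Axiom S6: all X such that X ∈ S for some S ∈ family."""
    return frozenset(x for s in family for x in s)

EMPTY = frozenset()                 # ∅
ONE = frozenset({EMPTY})            # {∅}
TWO = frozenset({EMPTY, ONE})       # {∅, {∅}}

A = TWO                             # {∅, {∅}}
B = frozenset({TWO})                # {{∅, {∅}}}

print(big_union(EMPTY) == EMPTY)                              # True: example 3.31
print(big_union(frozenset({A, B})) == A | B)                  # True: theorem 3.33
print(big_union(frozenset({A, B})) == frozenset({EMPTY, ONE, TWO}))  # True: example 3.34
```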


3.35 Example. Abbreviations are common [81, p. 98], [83, p. 453], [128, p. 129]:

0 := ∅,
1 := 0 ∪ {0} = ∅ ∪ {∅} = {∅},
2 := 1 ∪ {1} = {∅} ∪ {{∅}} = {∅, {∅}},
3 := 2 ∪ {2} = {∅, {∅}} ∪ {{∅, {∅}}} = {∅, {∅}, {∅, {∅}}},
4 := 3 ∪ {3} = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}},
5 := 4 ∪ {4},
6 := 5 ∪ {5},
7 := 6 ∪ {6},
8 := 7 ∪ {7},
9 := 8 ∪ {8}.

The sets 0, 1 are the binary digits; 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 are the decimal digits.
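
The pattern n ∪ {n} in these abbreviations lends itself to a short computation. The Python sketch below is an informal illustration, assuming that sets are modelled by frozenset; the names successor and numeral are hypothetical. It builds the ten digits as sets and confirms that the set abbreviated by k has exactly k elements, namely the sets abbreviated by the smaller digits.

```python
def successor(n):
    """The set n ∪ {n}."""
    return n | frozenset({n})

def numeral(k):
    """The set abbreviated by the digit k, starting from 0 := ∅."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

digits = [numeral(k) for k in range(10)]

print([len(d) for d in digits])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Each digit, as a set, contains exactly the smaller digits.
print(all(digits[j] == frozenset(digits[:j]) for j in range(10)))  # True
```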


3.36 Example. For all sets A, B, and C, there exists a set V whose only elements are 
A, B, and C. Indeed, the axiom of pairing yields two sets {A, B} and {B, C}. With 


ℱ = {{A, B}, {B, C}},
the axiom of union produces a set ⋃ℱ = {A, B} ∪ {B, C} such that for each set X,
[X ∈ ({A, B} ∪ {B, C})] ⇔ [(X ∈ {A, B}) ∨ (X ∈ {B, C})]
⇔ {[(X = A) ∨ (X = B)] ∨ [(X = B) ∨ (X = C)]}
⇔ [(X = A) ∨ (X = B) ∨ (X = C)].
The common notation {A, B, C} can replace {A, B} ∪ {B, C}, so that
{A, B, C} = {A, B} ∪ {B, C}.


3.37 Example. If the set ℱ contains only three sets A, B, and C, so that ℱ =
{A, B, C}, then ⋃{A, B, C} is also denoted by A ∪ B ∪ C, so that

A ∪ B ∪ C = ⋃{A, B, C} = ⋃ℱ

consists of all the elements that are elements of at least one of A, B, or C.


The following theorems show that unions and intersections of sets have features
similar to, and derived from, those of logical disjunctions and conjunctions.

3.38 Theorem. The union of sets commutes: ⊢ ∀A{∀B[(A ∪ B) = (B ∪ A)]}.

Proof. An informal proof may consist of the statements that a set belongs to A ∪ B
if and only if it belongs to A or B, which means that it belongs to B or A, and hence
to B ∪ A, whence A ∪ B = B ∪ A by the axiom of extensionality. A formal proof
reveals that the commutativity of the union ∪ corresponds to the commutativity of
the logical connective ∨ in the proof of theorem 3.13, which states that {A, B} =
{B, A}:

A ∪ B = ⋃{A, B} = ⋃{B, A} = B ∪ A. □


3.39 Theorem. The union of sets is associative:
⊢ ∀A[∀B(∀C{[(A ∪ B) ∪ C] = [A ∪ (B ∪ C)]})].

Proof. An informal proof can merely point out that belonging to “A or B, or C” is
equivalent to belonging to “A, or B or C.” A formal proof reveals that the associativity
of the union ∪ corresponds to the associativity of the logical connective ∨ in a
translation of [(A ∪ B) ∪ C] = [A ∪ (B ∪ C)] into a theorem with atomic formulae and
connectives:

[(A ∪ B) ∪ C] = [A ∪ (B ∪ C)]    yet unproved,
⇕ axiom of extensionality (S1),
∀X ({X ∈ [(A ∪ B) ∪ C]}
⇔ {X ∈ [A ∪ (B ∪ C)]})
⇕ theorem 3.33 twice,
∀X ({[(X ∈ A) ∨ (X ∈ B)] ∨ (X ∈ C)}
⇔ {(X ∈ A) ∨ [(X ∈ B) ∨ (X ∈ C)]})

which is in prenex form with a matrix that is a theorem:

⊢ {[(P) ∨ (Q)] ∨ (R)} ⇔ {(P) ∨ [(Q) ∨ (R)]},

in other words, by associativity of the logical disjunction ∨, by theorem 1.71. □


The following theorem shows that if a set B is an element of a set ℱ, then B is a
subset of the union ⋃ℱ.

3.40 Theorem. For each set ℱ and for each set B, if B ∈ ℱ, then B ⊆ (⋃ℱ):

⊢ ∀ℱ (∀B{(B ∈ ℱ) ⇒ [B ⊆ (⋃ℱ)]}).

Proof. An informal proof can substitute B for S in the axiom of union (S6):

• If B ∈ ℱ, then for each X ∈ B there exists S ∈ ℱ with X ∈ S, namely S = B;
• hence X ∈ ⋃ℱ for every X ∈ B, by definition of ⋃ℱ (axiom S6);
• consequently B ⊆ ⋃ℱ by definition of subsets (definition 3.7).

A formal proof consists in transforming the proposed formula until a theorem
appears. Because two steps involve not an equivalence but an implication, however,
the final proof reorders the five investigative steps in their reverse order:

(5) ∀ℱ(∀B{[B ∈ ℱ] ⇒ [B ⊆ (⋃ℱ)]})    yet unproved,
⇕ definition of ⊆,
(4) ∀ℱ[∀B([B ∈ ℱ] ⇒ {∀X[(X ∈ B) ⇒ (X ∈ ⋃ℱ)]})]
⇕ definition of ⋃,
(3) ∀ℱ[∀B([B ∈ ℱ] ⇒ {∀X[(X ∈ B) ⇒ (∃A{[A ∈ ℱ] ∧ [X ∈ A]})]})]
⇑ no X in (B ∈ ℱ), hence ⊢ {∀X[(P) ⇒ (Q)]} ⇒ {(P) ⇒ [∀X(Q)]}
(theorem 2.38),
(2) ∀ℱ{∀B(∀X{[B ∈ ℱ] ⇒ [(X ∈ B) ⇒ (∃A{[A ∈ ℱ] ∧ [X ∈ A]})]})}
⇑ ⊢ [Subf (B for A) (P)] ⇒ [∃A(P)],
(1) ∀ℱ(∀B{∀X[(B ∈ ℱ) ⇒ {(X ∈ B) ⇒ [(B ∈ ℱ) ∧ (X ∈ B)]}]})
which holds thanks to ⊢ (P) ⇒ {(Q) ⇒ [(P) ∧ (Q)]} (theorem 1.54). □


Theorem 3.41 shows that if ℱ is a set of subsets of a set A, then its union ⋃ℱ
is a subset of A.

3.41 Theorem. For all sets A and ℱ, if ℱ ⊆ 𝒫(A), then ⋃ℱ ⊆ A.

Proof. By the axiom of union (S6)
∀X {(X ∈ ⋃ℱ) ⇔ {∃B[(X ∈ B) ∧ (B ∈ ℱ)]}}.
Yet (B ∈ ℱ) ⇒ (B ⊆ A) by the hypothesis that ℱ ⊆ 𝒫(A), whence
[(X ∈ B) ∧ (B ∈ ℱ)] ⇒ [(X ∈ B) ∧ (B ⊆ A)]

by theorem 1.82, and hence X ∈ A by definition 3.7 of subset. Consequently,
because X ∈ A has no free occurrences of B, theorem 2.57 gives

∀X[(X ∈ ⋃ℱ) ⇒ (X ∈ A)],

so that ⋃ℱ ⊆ A again by definition 3.7 of subset. □


3.4.2 Intersections of Sets


Based on the axiom of union and the axiom of separation, definition 3.42 specifies
for each nonempty set ℱ a new set, denoted by ⋂ℱ and called the intersection of
ℱ, which consists of every element X that is an element of every element of ℱ.

3.42 Definition (intersection of sets). For each set ℱ, apply the axiom of union
to define 𝒰 := ⋃ℱ, and apply the axiom of separation to the set 𝒰 and to the
formula

∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)].
Then define the intersection of ℱ, a set denoted by ⋂ℱ, through the formula
⋂ℱ := {X ∈ 𝒰 : ∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)]},
so that
∀X{(X ∈ ⋂ℱ) ⇔ [(X ∈ ⋃ℱ) ∧ (∀Y{(Y ∈ ℱ) ⇒ (X ∈ Y)})]}.

For a nonempty set of sets ℱ, the definition of the intersection ⋂ℱ states that a
set X is an element of ⋂ℱ if and only if ∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)] holds, which occurs
if and only if X is an element of every element Y of ℱ. This definition also holds if
ℱ is empty because of the requirement that ⋂ℱ first be a subset of the union ⋃ℱ,
whence ⋂∅ = ∅. (This definition of ⋂ in terms of ⋃ conforms to Bernays’s [8, p. 14].)
If the set ℱ contains only two elements, then the definition of ⋂ℱ simplifies
considerably.
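
Definition 3.42 can be transcribed almost literally for finite sets. The following
Python sketch is a minimal illustration under stated assumptions: frozenset stands in
for sets, and the function names big_union and big_intersection are hypothetical names
chosen here, not notation from the text.

```python
def big_union(family: frozenset) -> frozenset:
    """⋃ℱ: every X that belongs to at least one element S of ℱ (axiom of union)."""
    return frozenset(x for s in family for x in s)


def big_intersection(family: frozenset) -> frozenset:
    """⋂ℱ: separate from ⋃ℱ those X with (Y ∈ ℱ) ⇒ (X ∈ Y) for every Y."""
    return frozenset(
        x for x in big_union(family) if all(x in y for y in family)
    )


family = frozenset({
    frozenset({1, 2, 3}),
    frozenset({2, 3, 4}),
    frozenset({3, 5}),
})

print(sorted(big_union(family)))         # [1, 2, 3, 4, 5]
print(sorted(big_intersection(family)))  # [3]

# With an empty family the universal condition holds vacuously, but no X is
# available in ⋃∅ = ∅ to separate, so the intersection is empty as well.
print(big_intersection(frozenset()) == frozenset())  # True
```

Separating the elements from ⋃ℱ, rather than from an unspecified universe, is what keeps
the empty case well defined in this sketch.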


3.43 Theorem. For all sets A and B, if ℱ = {A, B} then

⋂{A, B} = {X : [X ∈ (A ∪ B)] ∧ [(X ∈ A) ∧ (X ∈ B)]}.

Proof. Apply theorem 1.55, ⊢ [(H) ⇒ (K)] ⇒ ([(H) ⇒ (L)] ⇒ {(H) ⇒ [(K) ∧
(L)]}):

⊢ (∀Y{(Y ∈ {A, B}) ⇒ (X ∈ Y)})
   ⇒ [(A ∈ {A, B}) ⇒ (X ∈ A)]        specialization Subf (A for Y),
⊢ A ∈ {A, B}                          axiom of pairing (S3),
⊢ X ∈ A                               Detachment;
⊢ (∀Y{(Y ∈ {A, B}) ⇒ (X ∈ Y)})
   ⇒ [(B ∈ {A, B}) ⇒ (X ∈ B)]        specialization Subf (B for Y),
⊢ B ∈ {A, B}                          axiom of pairing (S3),
⊢ X ∈ B                               Detachment;
⊢ (∀Y{(Y ∈ {A, B}) ⇒ (X ∈ Y)})
   ⇒ [(X ∈ A) ∧ (X ∈ B)]             theorem 1.55;

⊢ (X ∈ A) ⇒ [(A ∈ {A, B}) ⇒ (X ∈ A)]    from axiom P1,
⊢ (X ∈ B) ⇒ [(B ∈ {A, B}) ⇒ (X ∈ B)]    from axiom P1,
⊢ [(X ∈ A) ∧ (X ∈ B)]
   ⇒ (∀Y{(Y ∈ {A, B}) ⇒ (X ∈ Y)})

which holds thanks to theorem 1.82:

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}. □


For the intersection of two sets, a special notation proves convenient. 
3.44 Definition. The notation A ∩ B stands for ⋂{A, B}.
3.45 Example. For the sets

A := {∅, {∅}, {{∅}}},
B := {{∅}, {{∅}}, {{{∅}}}},

A ∩ B contains only the elements belonging “simultaneously” to both A and B:
A ∩ B = {∅, {∅}, {{∅}}} ∩ {{∅}, {{∅}}, {{{∅}}}}
= {{∅}, {{∅}}}.


Because the definition of the intersection of sets relies upon the conjunction ∧,
the intersection has formal features similar to the logical features of the conjunction, 
for instance, commutativity and associativity, as demonstrated in the following 
theorems. 


3.46 Theorem. The intersection of sets commutes:
⊢ ∀A{∀B[(A ∩ B) = (B ∩ A)]}.

Proof. An informal proof can state that a set X is an element of A and B if and only
if X is an element of B and A. A formal proof shows that the commutativity of the
intersection ∩ corresponds to the commutativity of the logical connective ∧:

(A ∩ B) = (B ∩ A)    yet unproved,
⇕ axiom S1,
∀X {[X ∈ (A ∩ B)] ⇔ [X ∈ (B ∩ A)]}
⇕ theorem 3.43,
∀X {[(X ∈ A) ∧ (X ∈ B)] ⇔ [(X ∈ B) ∧ (X ∈ A)]}
which holds by the commutativity ⊢ [(P) ∧ (Q)] ⇔ [(Q) ∧ (P)] (theorem 1.57). □


3.47 Theorem. The intersection of sets is associative:
⊢ ∀A[∀B(∀C{[(A ∩ B) ∩ C] = [A ∩ (B ∩ C)]})].

Proof. An informal proof can rely on the equivalence of “A and B, and C” with
“A, and B and C.” A formal proof shows that the associativity of the intersection ∩
corresponds to the associativity of the logical connective ∧.

[(A ∩ B) ∩ C] = [A ∩ (B ∩ C)]    yet unproved,
⇕ axiom of extensionality (S1),
∀X({X ∈ [(A ∩ B) ∩ C]}
⇔ {X ∈ [A ∩ (B ∩ C)]})
⇕ theorem 3.43 twice,
∀X({[(X ∈ A) ∧ (X ∈ B)] ∧ (X ∈ C)}
⇔ {(X ∈ A) ∧ [(X ∈ B) ∧ (X ∈ C)]})

which holds thanks to the associativity of ∧ (theorem 1.66):

⊢ {[(P) ∧ (Q)] ∧ (R)} ⇔ {(P) ∧ [(Q) ∧ (R)]}. □


Theorem 3.48 shows that the intersection ⋂ℱ of a set of sets ℱ is a subset of
every element of ℱ. In other terms, for each set B, if B ∈ ℱ, then (⋂ℱ) ⊆ B.

3.48 Theorem. For each (nonempty) set ℱ and for each B ∈ ℱ, (⋂ℱ) ⊆ B:

⊢ ∀ℱ (∀B{(B ∈ ℱ) ⇒ [(⋂ℱ) ⊆ B]}).
Proof. An informal proof can consist of the following statements:

• If X ∈ ⋂ℱ, then X is an element of every Y ∈ ℱ, and hence of B ∈ ℱ;
• consequently, (⋂ℱ) ⊆ B, by definition of subsets.

A formal proof can consist in transforming the proposed formula into its prenex
form until a theorem appears in its matrix.
∀ℱ(∀B{(B ∈ ℱ) ⇒ [(⋂ℱ) ⊆ B]})    yet unproved,
⇕ definition of ⊆,
∀ℱ{∀B[(B ∈ ℱ) ⇒ {∀X[(X ∈ ⋂ℱ) ⇒ (X ∈ B)]}]}
⇕ definition of ⋂,
∀ℱ{∀B[(B ∈ ℱ) ⇒
{∀X[{∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)]} ⇒ (X ∈ B)]}]}
⇑ no X in B ∈ ℱ (theorem 2.38),
∀ℱ{∀B[∀X{(B ∈ ℱ) ⇒
[{∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)]} ⇒ (X ∈ B)]}]}
⇕ theorem 1.37,
∀ℱ{∀B[∀X({∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)]}
⇒ [(B ∈ ℱ) ⇒ (X ∈ B)])]}
with (P) standing for ∀Y[(Y ∈ ℱ) ⇒ (X ∈ Y)], (Q) for B ∈ ℱ, and (R) for X ∈ B,
which holds by specialization (Subf) and the law of commutation (theorem 1.37):
⊢ [(P) ⇒ {(Q) ⇒ (R)}] ⇔ [(Q) ⇒ {(P) ⇒ (R)}]. □


3.49 Definition (disjoint sets). Two sets A and B are disjoint if and only if
A ∩ B = ∅. Similarly, a set of sets ℱ is pairwise disjoint if and only if either
A = B or A ∩ B = ∅ for all elements A and B of ℱ.


For the union of disjoint sets, a special notation proves convenient. 


3.50 Definition (disjoint unions). A union A ∪ B is disjoint if and only if A ∩ B =
∅; only for disjoint sets, the notation A ∪̇ B stands for A ∪ B. Similarly, a union ⋃ℱ
is pairwise disjoint if and only if either A = B or A ∩ B = ∅ for all elements A and
B of ℱ; only for pairwise disjoint sets, the notation ⋃̇ℱ stands for ⋃ℱ.


3.4.3 Unions and Intersections of Sets 


Another notation for unions and intersections of a set 𝒢 proves convenient,
especially where the set 𝒢 relates in a specific way to a second set ℱ. The new
notation may then “index” the elements of 𝒢 with the corresponding elements of ℱ.

3.51 Example. For each set ℱ ⊆ 𝒫(U) of subsets of a set U, consider the set 𝒢
of all the complements U \ S of all the elements S ∈ ℱ; thus,

𝒢 := {B ∈ 𝒫(U) : ∃S[(S ∈ ℱ) ∧ (B = U \ S)]}.

The indexed notation then denotes the union and the intersection of 𝒢 as follows:

⋃_{S∈ℱ} (U \ S) := ⋃𝒢,

⋂_{S∈ℱ} (U \ S) := ⋂𝒢.

The notation on the left-hand sides avoids the need to write a formula for the set 𝒢.

Because the definitions of the intersection and union of sets rely on conjunction
and disjunction, the union and intersection have formal features similar to the logical
features of the conjunction and disjunction, for instance, distributivity.


3.52 Theorem (De Morgan’s Laws). For each set U, and for each set ℱ ⊆
𝒫(U) of subsets of U, the complement of the union equals the intersection of the
complements,

U \ (⋃ℱ) = ⋂_{A∈ℱ} (U \ A),

whereas for each set U, and for each nonempty family ℱ ⊆ 𝒫(U) of subsets of U,
the complement of the intersection equals the union of the complements,

U \ (⋂ℱ) = ⋃_{A∈ℱ} (U \ A).
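
Both identities of theorem 3.52 can be checked on a small nonempty family. The
following Python sketch is an illustration only, not a proof: the sets U and F and the
helper names big_union and big_intersection are assumptions chosen here, with
big_intersection following definition 3.42.

```python
def big_union(family):
    """⋃ℱ for a finite family of frozensets."""
    return frozenset().union(*family)


def big_intersection(family):
    """⋂ℱ, separated from ⋃ℱ as in definition 3.42."""
    return frozenset(x for x in big_union(family) if all(x in a for a in family))


U = frozenset(range(6))
F = frozenset({
    frozenset({0, 1, 2}),
    frozenset({1, 2, 3}),
    frozenset({2, 4}),
})

# The set of complements U \ A indexed by the elements A of F.
complements = frozenset(U - A for A in F)

# Complement of the union equals the intersection of the complements.
assert U - big_union(F) == big_intersection(complements) == frozenset({5})

# Complement of the intersection equals the union of the complements.
assert U - big_intersection(F) == big_union(complements) == frozenset({0, 1, 3, 4, 5})
```

The check deliberately uses a nonempty family, because with definition 3.42 the
intersection of an empty family is empty rather than all of U.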


Proof. For the complement of the intersection, an informal proof can proceed as
follows.

• For each set X, X ∈ U \ (⋂ℱ) if and only if X ∈ U but X ∉ (⋂ℱ);
• by definition, X ∉ (⋂ℱ) if and only if there exists A ∈ ℱ with X ∉ A;
• hence X ∈ U \ (⋂ℱ) if and only if X ∈ U and there exists A ∈ ℱ with X ∉ A;
• equivalently, X ∈ U \ (⋂ℱ) if and only if there exists A ∈ ℱ with X ∈ (U \ A);
• hence X ∈ U \ (⋂ℱ) if and only if X ∈ ⋃_{A∈ℱ} (U \ A).

The foregoing informal proof does not justify the permutation of the two
statements “X ∈ U” and “there exists A ∈ ℱ” but the following formal proof
justifies such a permutation by the absence of any free occurrence of A in X ∈ U.

U \ (⋂ℱ) = ⋃_{A∈ℱ} (U \ A)
⇕ (axiom S1),
∀X {[X ∈ {U \ (⋂ℱ)}] ⇔ [X ∈ ⋃_{A∈ℱ} (U \ A)]}
⇕ (\),
∀X {[(X ∈ U) ∧ {¬(X ∈ ⋂ℱ)}] ⇔ [X ∈ ⋃_{A∈ℱ} (U \ A)]}
⇕ (⋃, ⋂),
∀X {[(X ∈ U) ∧ (¬{∀A[(A ∈ ℱ) ⇒ (X ∈ A)]})]
⇔ {∃A[(A ∈ ℱ) ∧ {(X ∈ U) ∧ [¬(X ∈ A)]}]}}
⇕ ¬[∀A(P)] ⇔ ∃A[¬(P)],
∀X {[∃A{(X ∈ U) ∧ ¬({¬(A ∈ ℱ)} ∨ (X ∈ A))}]
⇔ [∃A{(A ∈ ℱ) ∧ [(X ∈ U) ∧ {¬(X ∈ A)}]}]}
⇕ De Morgan,
∀X {[∃A{(X ∈ U) ∧ ((A ∈ ℱ) ∧ [¬(X ∈ A)])}]
⇔ {∃A[(A ∈ ℱ) ∧ {(X ∈ U) ∧ [¬(X ∈ A)]}]}}
which holds by associativity ⊢ {(P) ∧ [(Q) ∧ (R)]} ⇔ {[(P) ∧ (Q)] ∧ (R)}
(theorem 1.66), together with the commutativity of ∧ (theorem 1.57).
For the complement of the union, an informal proof can proceed as follows.

• For each set X, X ∈ U \ (⋃ℱ) if and only if X ∈ U but X ∉ (⋃ℱ);
• by definition, X ∉ (⋃ℱ) if and only if X ∉ A for every A ∈ ℱ;
• hence X ∈ U \ (⋃ℱ) if and only if X ∈ U and X ∉ A for every A ∈ ℱ;
• equivalently, X ∈ U \ (⋃ℱ) if and only if X ∈ (U \ A) for every A ∈ ℱ;
• hence X ∈ U \ (⋃ℱ) if and only if X ∈ ⋂_{A∈ℱ} (U \ A).

The foregoing informal proof does not justify the permutation of the two statements
“X ∈ U” and “for every A ∈ ℱ” because it hides the permutation by placing the
quantifier at the end of the statements. The following formal proof justifies such a
permutation by the absence of any free occurrence of A in X ∈ U.

U \ (⋃ℱ) = ⋂_{A∈ℱ} (U \ A)
⇑ (S1),
∀X {[X ∈ U \ (⋃ℱ)] ⇔ [X ∈ ⋂_{A∈ℱ} (U \ A)]}
⇑ (\),
∀X {[(X ∈ U) ∧ {¬(X ∈ ⋃ℱ)}] ⇔ [X ∈ ⋂_{A∈ℱ} (U \ A)]}
⇑ (⋃, ⋂),
∀X {[(X ∈ U) ∧ {¬(∃A{(A ∈ ℱ) ∧ (X ∈ A)})}]
⇔ [(X ∈ ⋃_{A∈ℱ} (U \ A)) ∧ ∀A{(A ∈ ℱ) ⇒ {(X ∈ U) ∧ [¬(X ∈ A)]}}]}
⇑
∀X {[∀A{(X ∈ U) ∧ [¬((A ∈ ℱ) ∧ (X ∈ A))]}]
⇔ [(X ∈ U \ (⋂ℱ)) ∧ ∀A{[¬(A ∈ ℱ)] ∨ {(X ∈ U) ∧ [¬(X ∈ A)]}}]}
⇑
∀X {[∀A{(X ∈ U) ∧ ([¬(A ∈ ℱ)] ∨ [¬(X ∈ A)])}]
⇔ ([X ∈ U] ∧ {∃B[(B ∈ ℱ) ∧ {¬(X ∈ B)}]}
∧ (∀A{[¬(A ∈ ℱ)] ∨ {(X ∈ U) ∧ [¬(X ∈ A)]}}))}
⇑
∀X {[∀A{(X ∈ U) ∧ ([¬(A ∈ ℱ)] ∨ [¬(X ∈ A)])}]
⇔ [(X ∈ U) ∧ ∀A{[¬(A ∈ ℱ)] ∨ {(X ∈ U) ∧ [¬(X ∈ A)]}}]}

where the first implication, ⇑, used the assumption that ℱ ≠ ∅ for the formula
beginning with ∃B ...; the last line follows from the distributivity of ∧ over ∨. □


3.53 Theorem. For all sets ℱ and 𝒢, intersection distributes over their unions:

⊢ [(⋃ℱ) ∩ (⋃𝒢)] = [⋃{A ∩ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}].

Proof. An informal proof can proceed as follows.

• For each set X, X ∈ [(⋃ℱ) ∩ (⋃𝒢)] if and only if X ∈ (⋃ℱ) and X ∈ (⋃𝒢);
• equivalently, X belongs to some A ∈ ℱ and X belongs to some B ∈ 𝒢;
• equivalently, X ∈ (A ∩ B) for some elements A ∈ ℱ and B ∈ 𝒢;
• equivalently, X ∈ [⋃{A ∩ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}].

A formal proof can proceed as follows.

[(⋃ℱ) ∩ (⋃𝒢)]
= [⋃{A ∩ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}]    yet unproved,
⇕ extensionality,
∀X ({X ∈ [(⋃ℱ) ∩ (⋃𝒢)]}
⇔ {X ∈ [⋃{A ∩ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}]})
⇕ definitions: ∩, ⋃,
∀X ({[X ∈ (⋃ℱ)] ∧ [X ∈ (⋃𝒢)]}
⇔ [∃A(∃B{(A ∈ ℱ) ∧ (B ∈ 𝒢) ∧ [X ∈ (A ∩ B)]})])
⇕ definition of ⋃,
∀X (({∃A[(A ∈ ℱ) ∧ (X ∈ A)]} ∧ {∃B[(B ∈ 𝒢) ∧ (X ∈ B)]})
⇔ [∃A(∃B{(A ∈ ℱ) ∧ (B ∈ 𝒢) ∧ [X ∈ (A ∩ B)]})])
⇕ no free A or B,
∀X ([∃A{∃B[(A ∈ ℱ) ∧ (B ∈ 𝒢) ∧ (X ∈ A) ∧ (X ∈ B)]}]
⇔ [∃A(∃B{(A ∈ ℱ) ∧ (B ∈ 𝒢) ∧ [X ∈ (A ∩ B)]})])

which holds by definition of A ∩ B (theorem 3.43). □


3.54 Theorem. For all sets ℱ and 𝒢, union distributes over their intersections:

⊢ [(⋂ℱ) ∪ (⋂𝒢)] = [⋂{A ∪ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}].

Proof. An informal proof can proceed as follows.

• For each set X, X ∈ [(⋂ℱ) ∪ (⋂𝒢)] if and only if X ∈ (⋂ℱ) or X ∈ (⋂𝒢);
• equivalently, X belongs to every A ∈ ℱ or X belongs to every B ∈ 𝒢;
• equivalently, X ∈ (A ∪ B) for all elements A ∈ ℱ and B ∈ 𝒢;
• equivalently, X ∈ [⋂{A ∪ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}].

A formal proof can proceed as follows.

[(⋂ℱ) ∪ (⋂𝒢)]
= [⋂{A ∪ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}]    yet unproved,
⇑ (S1),
∀X ({X ∈ [(⋂ℱ) ∪ (⋂𝒢)]}
⇔ {X ∈ [⋂{A ∪ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}]})
⇑ definitions of ∪, ⋂,
∀X ({[X ∈ (⋂ℱ)] ∨ [X ∈ (⋂𝒢)]}
⇔ [∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [X ∈ (A ∪ B)]})])
⇑ definition of ⋂,
∀X (({∀A[(A ∈ ℱ) ⇒ (X ∈ A)]} ∨ {∀B[(B ∈ 𝒢) ⇒ (X ∈ B)]})
⇔ [∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [X ∈ (A ∪ B)]})])
⇑ no free A, B,
∀X ([∀A(∀B{[(A ∈ ℱ) ⇒ (X ∈ A)] ∨ [(B ∈ 𝒢) ⇒ (X ∈ B)]})]
⇔ [∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [X ∈ (A ∪ B)]})])
⇑ [(P) ∧ (Q)] ⇒ (P),
∀X ([∀A(∀B{[(A ∈ ℱ) ⇒ (X ∈ A)] ∨ [(B ∈ 𝒢) ⇒ (X ∈ B)]})]
⇔ [∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [(X ∈ A) ∨ (X ∈ B)]})])
⇑ definition of ∪,
∀X ([∀A(∀B{[(A ∈ ℱ) ⇒ (X ∈ A)] ∨ [(B ∈ 𝒢) ⇒ (X ∈ B)]})]
⇔ [∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [X ∈ (A ∪ B)]})])
which holds by definition of A ∪ B (theorem 3.33). □
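
The two distributive laws of theorems 3.53 and 3.54 can be tried on small nonempty
families. The following Python sketch is an illustration under stated assumptions: the
families F and G and the helper names big_union and big_intersection are hypothetical
choices made here, with big_intersection following definition 3.42.

```python
def big_union(family):
    """⋃ℱ for a finite family of frozensets."""
    return frozenset().union(*family)


def big_intersection(family):
    """⋂ℱ, separated from ⋃ℱ as in definition 3.42."""
    return frozenset(x for x in big_union(family) if all(x in a for a in family))


F = frozenset({frozenset({1, 2}), frozenset({2, 3})})
G = frozenset({frozenset({2, 4}), frozenset({2, 3, 5})})

# {A ∩ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)} and {A ∪ B : (A ∈ ℱ) ∧ (B ∈ 𝒢)}
pairwise_meets = frozenset(A & B for A in F for B in G)
pairwise_joins = frozenset(A | B for A in F for B in G)

# Theorem 3.53: (⋃ℱ) ∩ (⋃𝒢) = ⋃{A ∩ B : ...}
assert big_union(F) & big_union(G) == big_union(pairwise_meets) == frozenset({2, 3})

# Theorem 3.54: (⋂ℱ) ∪ (⋂𝒢) = ⋂{A ∪ B : ...}
assert big_intersection(F) | big_intersection(G) == big_intersection(pairwise_joins)
```

A passing check on one example is of course no substitute for the proofs; it only makes
the quantifier patterns "some A and some B" and "every A and every B" tangible.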


Besides union, intersection, and difference, another operation with sets is useful. 


3.55 Definition (symmetric difference). The symmetric difference of any sets A
and B is denoted by A Δ B and defined by

A Δ B := (A ∪ B) \ (A ∩ B).
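
Definition 3.55 translates directly into set operations. The following Python sketch is
a small illustration: the function name symmetric_difference and the sample sets are
assumptions chosen here; the operator ^ is Python's built-in symmetric difference on
sets.

```python
def symmetric_difference(a: frozenset, b: frozenset) -> frozenset:
    """A Δ B := (A ∪ B) \\ (A ∩ B), as in definition 3.55."""
    return (a | b) - (a & b)


A = frozenset({1, 2, 3})
B = frozenset({3, 4})

# Elements in exactly one of A and B.
assert symmetric_difference(A, B) == frozenset({1, 2, 4})

# The definition agrees with Python's built-in operator on this example.
assert symmetric_difference(A, B) == A ^ B
```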


3.4.4 Exercises on Unions and Intersections of Sets 


3.51. List all the elements of {2, 3, 7} ∪ {3, 5, 7}.
3.52. List all the elements of {4, 6, 8} ∪ {4, 8, 9}.
3.53. List all the elements of {2, 3, 7} ∩ {3, 5, 7}.
3.54. List all the elements of {4, 6, 8} ∩ {4, 8, 9}.
3.55. List all the elements of {2, 3, 7} Δ {3, 5, 7}.

3.56. List all the elements of {4, 6, 8} Δ {4, 8, 9}.

3.57. Provide an example of a set ℱ such that ⋃ℱ = ℱ.

3.58. Provide an example of a set ℱ such that ⋂ℱ = ℱ.

3.59. Provide an example of a set ℱ such that ⋃ℱ ≠ ℱ.

3.60. Provide an example of a set ℱ such that ⋂ℱ ≠ ℱ.

3.61. Provide examples of sets B and ℱ such that B ∈ ℱ but B ∉ ⋃ℱ.

3.62. Provide examples of sets B and ℱ such that B ∈ ℱ but ⋂ℱ ∉ B.

3.63. Provide examples of sets A, B, X, Y such that X ∈ A and Y ∈ B but (X ∪ Y) ∉
(A ∪ B).

3.64. Provide examples of sets A, B, X, Y such that X ∈ A and Y ∈ B but (X ∩ Y) ∉
(A ∩ B).

3.65. For each set A, prove that ⋃{A} = A.

3.66. For each set A, prove that ⋃𝒫(A) = A.

3.67. Prove that ∀A∀B∀C {[(A ∪ B) ∩ C] = [(A ∩ C) ∪ (B ∩ C)]}.

3.68. Prove that ∀A∀B∀C {[(A ∩ B) ∪ C] = [(A ∪ C) ∩ (B ∪ C)]}.

3.69. Prove that ∀A[(A ∪ ∅) = A].

3.70. Prove that ∀A[(A ∩ ∅) = ∅].

3.71. Prove that ∀A[(A ∪ A) = A].

3.72. Prove that ∀A[(A ∩ A) = A].


Prove the following formulae for all subsets A and B of each set U. 


3.73. [U \ (A ∩ B)] = (U \ A) ∪ (U \ B)
3.74. [U \ (A ∪ B)] = (U \ A) ∩ (U \ B)
3.75. [(A \ B) \ U] = (A \ U) \ (B \ U)
3.76. [U \ (A \ B)] = (U \ A) ∪ (U ∩ B)
3.77. [(A ∪ B) \ U] = (A \ U) ∪ (B \ U)
3.78. [(A ∩ B) \ U] = (A \ U) ∩ (B \ U)
3.79. [U ∩ (A \ B)] = (U \ B) \ (U \ A)
3.80. Prove or disprove that ([𝒫(A)] ∪ [𝒫(B)]) ⊆ 𝒫(A ∪ B).
3.81. Prove or disprove that ([𝒫(A)] ∩ [𝒫(B)]) ⊇ 𝒫(A ∩ B).
3.82. Prove or disprove that ([𝒫(A)] ∩ [𝒫(B)]) ⊆ 𝒫(A ∩ B).
3.83. Prove or disprove that ([𝒫(A)] ∪ [𝒫(B)]) ⊇ 𝒫(A ∪ B).
3.84. Prove or disprove that ([𝒫(A)] \ [𝒫(B)]) ⊆ 𝒫(A \ B).
3.85. Prove or disprove that ([𝒫(A)] \ [𝒫(B)]) ⊇ 𝒫(A \ B).
3.86. For each nonempty set ℱ ⊆ 𝒫(U), and each B ⊆ U, prove that

(⋂ℱ) ∪ B = ⋂_{A∈ℱ} (A ∪ B).

3.87. For each set ℱ ⊆ 𝒫(U), and for each subset B ⊆ U, prove that

(⋃ℱ) ∩ B = ⋃_{A∈ℱ} (A ∩ B).

3.88. Prove that A Δ ∅ = A.
3.89. Prove that A Δ A = ∅.
3.90. Prove that the symmetric difference is associative: (A Δ B) Δ C = A Δ (B Δ C).
3.91. Prove that the symmetric difference commutes: A Δ B = B Δ A.
3.92. Prove or disprove that [(A Δ C) ∪ (B Δ C)] ⊆ [(A ∪ B) Δ C].
3.93. Prove or disprove that [(A Δ C) ∪ (B Δ C)] ⊇ [(A ∪ B) Δ C].
3.94. Prove or disprove that [(A Δ C) ∩ (B Δ C)] ⊆ [(A ∩ B) Δ C].
3.95. Prove or disprove that [(A Δ C) ∩ (B Δ C)] ⊇ [(A ∩ B) Δ C].
3.96. Prove or disprove that [(A Δ C) \ (B Δ C)] ⊆ [(A \ B) Δ C].
3.97. Prove or disprove that [(A Δ C) \ (B Δ C)] ⊇ [(A \ B) Δ C].
3.98. Prove or disprove that [(A ∪ C) Δ (B ∪ C)] ⊆ [(A Δ B) ∪ C].
3.99. Prove or disprove that [(A ∪ C) Δ (B ∪ C)] ⊇ [(A Δ B) ∪ C].
3.100. Prove or disprove that [(A ∩ C) Δ (B ∩ C)] ⊆ [(A Δ B) ∩ C].
3.101. Prove or disprove that [(A ∩ C) Δ (B ∩ C)] ⊇ [(A Δ B) ∩ C].
3.102. Prove or disprove that [(A \ C) Δ (B \ C)] ⊆ [(A Δ B) \ C].
3.103. Prove or disprove that [(A \ C) Δ (B \ C)] ⊇ [(A Δ B) \ C].
3.104. Prove or disprove that [(C \ A) Δ (C \ B)] ⊆ [C \ (A Δ B)].
3.105. Prove or disprove that [(C \ A) Δ (C \ B)] ⊇ [C \ (A Δ B)].
3.106. Prove or disprove that ([𝒫(A)] Δ [𝒫(B)]) ⊆ 𝒫(A Δ B).
3.107. Prove or disprove that ([𝒫(A)] Δ [𝒫(B)]) ⊇ 𝒫(A Δ B).
3.108. Prove that if A ∩ B = ∅, then A ∪ B = (A \ B) ∪ (B \ A).
3.109. Prove that A ∪ B = (A Δ B) ∪ (A ∩ B).
3.110. Prove that A Δ B = (A \ B) ∪ (B \ A).


3.5 Cartesian Products and Relations 


Beyond logic and sets, much of mathematics consists of connections between types
of sets called Cartesian products, mathematical functions, and mathematical
relations. These types of sets allow for mathematical specifications, analysis, synthesis,
and processing of such concepts as graphs, maps, algorithms, and rankings.


3.5.1 Cartesian Products of Sets 


Cartesian products contain certain sets with two elements. Whereas {X, Y} = {Y, X}
for all sets X and Y, however, some situations require a method for listing the
elements of a set in a specific order by means of “ordered pairs” or otherwise, for
example, in geography and in navigation as in figure 3.2. The following definition
(attributed to Wiener and Kuratowski [83, p. 455]) and theorem 3.58 derive such
ordered pairs from sets, which shows that the concept of ordered pairs does not
require any additional axiom.

Fig. 3.2. If X ≠ Y, then (X, Y) ≠ (Y, X).


3.56 Definition (ordered pairs). For all sets X and Y, the ordered pair (X, Y) is
the set defined by three applications of the axiom of pairing as

(X, Y) := {{X}, {X, Y}}.

X is the first coordinate of (X, Y), and Y is the second coordinate of (X, Y).
3.57 Example.

• If X = 0 and Y = 1, then (X, Y) = {{0}, {0, 1}}.
• If X = 1 and Y = 0, then (X, Y) = {{1}, {1, 0}} = {{1}, {0, 1}}.
• If X = 0 and Y = 0, then (X, Y) = {{0}, {0, 0}} = {{0}, {0}} = {{0}}.
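
The three cases of example 3.57 can be reproduced with immutable sets. The following
Python sketch is an illustration only: the function name ordered_pair is a hypothetical
name chosen here, and frozenset stands in for sets, with 0 := ∅ and 1 := {∅} as in
example 3.35.

```python
def ordered_pair(x, y) -> frozenset:
    """The pair (X, Y) := {{X}, {X, Y}} of definition 3.56."""
    return frozenset({frozenset({x}), frozenset({x, y})})


zero = frozenset()        # 0 := ∅
one = frozenset({zero})   # 1 := {∅}

# Unordered pairs ignore the order of their elements ...
assert frozenset({zero, one}) == frozenset({one, zero})

# ... whereas ordered pairs record it.
assert ordered_pair(zero, one) != ordered_pair(one, zero)

# With equal coordinates the pair collapses to {{0}}.
assert ordered_pair(zero, zero) == frozenset({frozenset({zero})})
```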


The following theorem confirms that, in contrast to sets with two elements, 
ordered pairs record the order of their coordinates. 


3.58 Theorem. For all sets X and Y, if X ≠ Y, then (X, Y) ≠ (Y, X).

Proof. An informal proof can consist in showing that the two sets (X, Y) and (Y, X)
contain different elements, whence (X, Y) ≠ (Y, X).

• For all sets X and Y, X ∈ {X} and Y ∈ {Y} by pairing (S3);
• if X ≠ Y, then X ∉ {Y} and Y ∉ {X}, by pairing (S3);
• hence if X ≠ Y, then {X} ≠ {Y} by extensionality (S1);
• from Y ∉ {X} and Y ∈ {Y, X} it follows that {X} ≠ {Y, X} by (S1);
• from {X} ≠ {Y} and {X} ≠ {Y, X} follows {X} ∉ {{Y}, {Y, X}} = (Y, X);
• yet {X} ∈ {{X}, {X, Y}} = (X, Y);
• from {X} ∈ (X, Y) and {X} ∉ (Y, X) follows (X, Y) ≠ (Y, X), by (S1).
A formal proof can parallel the same reasoning.
(1) ⊢ ∀Z[(Z ∈ {X}) ⇔ (Z = X)]            axiom of pairing (S3),
(2) ⊢ ∀Z[(Z ≠ X) ⇒ (Z ∉ {X})]            contraposition and transposition,
(3) ⊢ (Y ≠ X) ⇒ (Y ∉ {X})                specialization Subf (Y for Z),
(4) ⊢ Y ≠ X                               hypothesis,
(5) ⊢ Y ∉ {X}                             Detachment;
(6) ⊢ Y ∈ {X, Y}                          axiom of pairing (S3),
(7) ⊢ {X} ≠ {X, Y}                        (5), (6), and extensionality (S1);
(8) ⊢ Y ∈ {Y}                             axiom of pairing (S3),
(9) ⊢ {X} ≠ {Y}                           (5), (8), and extensionality (S1);
(10) ⊢ {X} ∉ {{Y}, {Y, X}} = (Y, X)       (7), (9), and extensionality (S1);
(11) ⊢ {X} ∈ {{X}, {X, Y}} = (X, Y)       pairing (S3),
(12) ⊢ (X, Y) ≠ (Y, X)                    (10), (11), and extensionality (S1).
□

The definition of the ordered pair (X, Y) holds for all sets X and Y, in particular,
for all elements X and Y of two sets A and B. The following theorem shows that all
these ordered pairs (X, Y) are themselves elements of a set.


3.59 Theorem. For all sets A and B, for each X ∈ A and for each Y ∈ B, the pair
(X, Y) belongs to 𝒫[𝒫(A ∪ B)].

Proof. An informal proof can trace back the definition of the ordered pair (X, Y):

• From X ∈ A follows X ∈ A ∪ B, because A ⊆ A ∪ B, whence {X} ∈ 𝒫(A ∪ B);
• from Y ∈ B it follows that Y ∈ A ∪ B, because B ⊆ A ∪ B;
• from X ∈ A ∪ B and Y ∈ A ∪ B it follows that {X, Y} ∈ 𝒫(A ∪ B);
• from {X} ∈ 𝒫(A ∪ B) and {X, Y} ∈ 𝒫(A ∪ B), it follows that {{X}, {X, Y}} ∈
𝒫[𝒫(A ∪ B)].

A formal proof can parallel the foregoing argument.

⊢ X ∈ A                               hypothesis,
⊢ A ⊆ (A ∪ B)                         theorem 3.40,
⊢ X ∈ (A ∪ B)                         definition of subsets and specialization,
⊢ {X} ⊆ (A ∪ B)                       definitions of subsets and singletons,
⊢ {X} ∈ 𝒫(A ∪ B)                      definition of power sets.
⊢ Y ∈ B                               hypothesis,
⊢ Y ∈ (A ∪ B)                         as for X ∈ (A ∪ B),
⊢ {X, Y} ∈ 𝒫(A ∪ B)                   X ∈ (A ∪ B) and Y ∈ (A ∪ B),
⊢ {{X}, {X, Y}} ⊆ 𝒫(A ∪ B)            {X} ∈ 𝒫(A ∪ B), {X, Y} ∈ 𝒫(A ∪ B),
⊢ {{X}, {X, Y}} ∈ 𝒫[𝒫(A ∪ B)]         definition of power set.
□


Because every ordered pair (X, Y) with X ∈ A and Y ∈ B belongs to the set
𝒫[𝒫(A ∪ B)], the axiom of separation guarantees that the collection of all such
ordered pairs is a set.

3.60 Definition (Cartesian product). For all sets A and B, the Cartesian product
of A and B is the set A × B (read “A cross B”), consisting of all ordered pairs (X, Y)
with X ∈ A and Y ∈ B. Thus,

A × B
= {C ∈ 𝒫[𝒫(A ∪ B)] : ∃X(∃Y{(X ∈ A) ∧ (Y ∈ B) ∧ [C = (X, Y)]})}
= {(X, Y) : (X ∈ A) ∧ (Y ∈ B)}

with ∃X(∃Y{(X ∈ A) ∧ (Y ∈ B) ∧ [C = (X, Y)]}) for P in the axiom of separation
(S5). The sets A and B are the factors of the Cartesian product A × B.

A common graphical representation of a Cartesian product A × B lists all the
elements of A along a horizontal axis, and all the elements of B along a vertical axis,
so that the element (X, Y) of A × B appears directly above X and across from Y.


3.61 Example. For the sets

A := {0, 1, 2},
B := {0, 1},

the Cartesian product A × B takes the form
A × B = {(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)},
with the following graphical representation:

B             A × B
1    (0, 1)  (1, 1)  (2, 1)
0    (0, 0)  (1, 0)  (2, 0)

        0       1       2      A
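
Example 3.61 can be reproduced mechanically. In the following Python sketch, which is
only an illustration, Python tuples stand in for the ordered pairs of definition 3.56
(an assumption made for readability), and itertools.product from the standard library
enumerates all pairs.

```python
from itertools import product

A = [0, 1, 2]
B = [0, 1]

# All ordered pairs (X, Y) with X ∈ A and Y ∈ B.
cartesian = set(product(A, B))
assert cartesian == {(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)}

# The product of sets with 3 and 2 elements has 3 · 2 = 6 elements.
assert len(cartesian) == len(A) * len(B)

# Graphical representation: elements of B down the side, elements of A across.
for y in reversed(B):
    print(y, " ".join(f"({x},{y})" for x in A))
# 1 (0,1) (1,1) (2,1)
# 0 (0,0) (1,0) (2,0)
```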


For practice and for future use, the following theorems establish relations 
between Cartesian products and other operations with sets. 


3.62 Theorem. For all sets A, B, C, and D,
[(A ∩ C) × (B ∩ D)] = [(A × B) ∩ (C × D)].

Proof. An informal proof can establish that [(A ∩ C) × (B ∩ D)] and [(A × B) ∩
(C × D)] have exactly the same elements. These two sets are Cartesian products,
and consequently their elements are ordered pairs.

• An ordered pair (X, Y) is an element of (A ∩ C) × (B ∩ D) if and only if
X ∈ (A ∩ C) and Y ∈ (B ∩ D);
• hence (X, Y) ∈ [(A ∩ C) × (B ∩ D)] if and only if X ∈ A and X ∈ C, and Y ∈ B
and Y ∈ D,
• which is equivalent to X ∈ A and Y ∈ B, and X ∈ C and Y ∈ D,
• which is equivalent to (X, Y) ∈ (A × B) and (X, Y) ∈ (C × D),
• which is thus equivalent to (X, Y) ∈ [(A × B) ∩ (C × D)].

As the preceding informal proof swapped X ∈ C and Y ∈ B, a formal proof can rely
on the commutativity [(P) ∧ (Q)] ⇔ [(Q) ∧ (P)] of ∧ by theorem 1.57:

(X, Y) ∈ [(A ∩ C) × (B ∩ D)]
⇕ definition of Cartesian products,
[X ∈ (A ∩ C)] ∧ [Y ∈ (B ∩ D)]
⇕ definition of intersection,
[(X ∈ A) ∧ (X ∈ C)] ∧ [(Y ∈ B) ∧ (Y ∈ D)]
⇕ commutativity and associativity of ∧,
[(X ∈ A) ∧ (Y ∈ B)] ∧ [(X ∈ C) ∧ (Y ∈ D)]
⇕ definition of Cartesian products,
[(X, Y) ∈ (A × B)] ∧ [(X, Y) ∈ (C × D)]
⇕ definition of intersection.
(X, Y) ∈ [(A × B) ∩ (C × D)]


3.63 Theorem. For all sets A, B, C, and D,

[(A × B) ∪ (C × D) ∪ (A × D) ∪ (C × B)] = [(A ∪ C) × (B ∪ D)].

Proof. An informal proof can establish that [(A × B) ∪ (C × D) ∪ (A × D) ∪ (C × B)]
and [(A ∪ C) × (B ∪ D)] have exactly the same elements.

• An ordered pair (X, Y) is an element of (A × B) ∪ (C × D) ∪ (A × D) ∪ (C × B)
if and only if (X, Y) ∈ (A × B), or (X, Y) ∈ (C × D), or (X, Y) ∈ (A × D), or
(X, Y) ∈ (C × B),
• which is equivalent to X ∈ A and Y ∈ B, or X ∈ C and Y ∈ D, or X ∈ A and
Y ∈ D, or X ∈ C and Y ∈ B,
• which is equivalent to X ∈ A or X ∈ C, and Y ∈ B or Y ∈ D,
• which is equivalent to (X, Y) ∈ [(A ∪ C) × (B ∪ D)].

A formal proof translates the alleged equivalences into the logical theorem

⊢ {[(P) ∧ (Q)] ∨ [(P) ∧ (S)] ∨ [(R) ∧ (Q)] ∨ [(R) ∧ (S)]} ⇔ {[(P) ∨ (R)] ∧ [(Q) ∨ (S)]},

which follows from the distributivity of ∧ over ∨ (theorem 1.75), with

(P) ⇔ (X ∈ A),  (Q) ⇔ (Y ∈ B),  (R) ⇔ (X ∈ C),  (S) ⇔ (Y ∈ D).

Thus,

(X, Y) ∈ [(A × B) ∪ (C × D) ∪ (A × D) ∪ (C × B)]
⇕ definition of ∪,
[(X, Y) ∈ (A × B)] ∨ [(X, Y) ∈ (C × D)]
∨ [(X, Y) ∈ (A × D)] ∨ [(X, Y) ∈ (C × B)]
⇕ definition of ×,
[(X ∈ A) ∧ (Y ∈ B)] ∨ [(X ∈ C) ∧ (Y ∈ D)]
∨ [(X ∈ A) ∧ (Y ∈ D)] ∨ [(X ∈ C) ∧ (Y ∈ B)]
⇕ theorem 1.75,
[(X ∈ A) ∨ (X ∈ C)] ∧ [(Y ∈ B) ∨ (Y ∈ D)]
⇕ definition of ∪,
[X ∈ (A ∪ C)] ∧ [Y ∈ (B ∪ D)]
⇕ definition of ×.
(X, Y) ∈ [(A ∪ C) × (B ∪ D)]
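
The identities of theorems 3.62 and 3.63 can be tried on concrete sets. The following
Python sketch is an illustration under stated assumptions: the helper name cross and
the sample sets are hypothetical choices made here, and tuples stand in for ordered
pairs.

```python
from itertools import product


def cross(a, b):
    """A × B as a set of tuples (tuples stand in for ordered pairs)."""
    return set(product(a, b))


A, B = {1, 2}, {"p", "q"}
C, D = {2, 3}, {"q", "r"}

# Theorem 3.62: (A ∩ C) × (B ∩ D) = (A × B) ∩ (C × D)
assert cross(A & C, B & D) == cross(A, B) & cross(C, D) == {(2, "q")}

# Theorem 3.63: all four "mixed" products are needed to fill (A ∪ C) × (B ∪ D).
assert cross(A | C, B | D) == cross(A, B) | cross(C, D) | cross(A, D) | cross(C, B)

# Without the mixed products, pairs such as (1, "r") would be missing.
assert (1, "r") not in cross(A, B) | cross(C, D)
```

The last assertion shows why the union case has four terms on the left-hand side
whereas the intersection case has only two.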


3.5.2 Cartesian Products of Unions and Intersections 


The foregoing theorems generalize to Cartesian products of unions or intersections 
of any sets of sets. 


3.64 Theorem. For all sets of sets ℱ and 𝒢,

(⋃ℱ) × (⋃𝒢) = ⋃_{(A,B)∈ℱ×𝒢} (A × B).

Proof. An informal proof can show that (⋃ℱ) × (⋃𝒢) and ⋃_{(A,B)∈ℱ×𝒢} (A × B)
have exactly the same elements.

• A pair (X, Y) is an element of (⋃ℱ) × (⋃𝒢) if and only if X ∈ (⋃ℱ) and
Y ∈ (⋃𝒢),
• which is equivalent to X ∈ A for some A ∈ ℱ and Y ∈ B for some B ∈ 𝒢,
• which is equivalent to (X, Y) ∈ (A × B) for some (A, B) ∈ (ℱ × 𝒢).
A formal proof can parallel the foregoing reasoning.

(X, Y) ∈ (⋃ℱ) × (⋃𝒢)
⇕ definition of ×,
[X ∈ (⋃ℱ)] ∧ [Y ∈ (⋃𝒢)]
⇕ definition of ⋃,
{∃A[(A ∈ ℱ) ∧ (X ∈ A)]} ∧ {∃B[(B ∈ 𝒢) ∧ (Y ∈ B)]}
⇕ no B, no A,
∃A(∃B{[(A ∈ ℱ) ∧ (X ∈ A)] ∧ [(B ∈ 𝒢) ∧ (Y ∈ B)]})
⇕ properties of ∧,
∃A(∃B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ∧ [(X ∈ A) ∧ (Y ∈ B)]})
⇕ definition of ×,
∃A(∃B{[(A, B) ∈ (ℱ × 𝒢)] ∧ [(X, Y) ∈ (A × B)]})
⇕ definition of ⋃.
(X, Y) ∈ ⋃_{(A,B)∈ℱ×𝒢} (A × B)

3.65 Theorem. For all sets of sets ℱ and 𝒢,

(⋂ℱ) × (⋂𝒢) = ⋂_{(A,B)∈ℱ×𝒢} (A × B).


Proof. An informal proof can show that (⋂ℱ) × (⋂𝒢) and ⋂_{(A,B)∈ℱ×𝒢} (A × B)
have exactly the same elements.
• A pair (X, Y) is an element of (⋂ℱ) × (⋂𝒢) if and only if X ∈ (⋂ℱ) and
Y ∈ (⋂𝒢),
• which is equivalent to X ∈ A for every A ∈ ℱ and Y ∈ B for every B ∈ 𝒢,
• which is equivalent to (X, Y) ∈ (A × B) for every (A, B) ∈ (ℱ × 𝒢).

The foregoing informal proof glosses over the case in which at least one of ℱ or 𝒢
is empty, which would require invoking the corresponding unions as supersets of the
intersections, because of the definition of intersections. One remedy could consist in
proving such cases separately. Indeed, ℱ = ∅ or 𝒢 = ∅ if and only if (ℱ × 𝒢) =
∅. A formal proof can parallel the foregoing reasoning with theorem 1.82,

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇒ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]},

to prove that

(⋂ℱ) × (⋂𝒢) ⊆ ⋂_{(A,B)∈ℱ×𝒢} (A × B),

with the hypotheses C ∈ ℱ for P and D ∈ 𝒢 for R for the converse, so that

⊢ {[(P) ⇒ (Q)] ∧ [(R) ⇒ (S)]} ⇐ ([(P) ∧ (R)] ∧ {[(P) ∧ (R)] ⇒ [(Q) ∧ (S)]}).
Thus,
(X, Y) ∈ [(⋂ℱ) × (⋂𝒢)]
⇕ definition of ×,
[X ∈ (⋂ℱ)] ∧ [Y ∈ (⋂𝒢)]
⇕ ⋂ℋ ⊆ ⋃ℋ,
({∀A[(A ∈ ℱ) ⇒ (X ∈ A)]} ∧ {∀B[(B ∈ 𝒢) ⇒ (Y ∈ B)]})
∧ ({∃C[(C ∈ ℱ) ∧ (X ∈ C)]} ∧ {∃D[(D ∈ 𝒢) ∧ (Y ∈ D)]})
⇕ no B, no A,
[∀A(∀B{[(A ∈ ℱ) ⇒ (X ∈ A)] ∧ [(B ∈ 𝒢) ⇒ (Y ∈ B)]})]
∧ [∃C(∃D{[(C ∈ ℱ) ∧ (X ∈ C)] ∧ [(D ∈ 𝒢) ∧ (Y ∈ D)]})]
⇕ theorems,
[∀A(∀B{[(A ∈ ℱ) ∧ (B ∈ 𝒢)] ⇒ [(X ∈ A) ∧ (Y ∈ B)]})]
∧ [∃C(∃D{[(C ∈ ℱ) ∧ (X ∈ C)] ∧ [(D ∈ 𝒢) ∧ (Y ∈ D)]})]
⇕ definition of ×,
[∀A(∀B{[(A, B) ∈ (ℱ × 𝒢)] ⇒ [(X, Y) ∈ (A × B)]})]
∧ [∃C(∃D{[(C, D) ∈ (ℱ × 𝒢)] ∧ [(X, Y) ∈ (C × D)]})]
⇕ definition of ⋂.
(X, Y) ∈ ⋂_{(A,B)∈ℱ×𝒢} (A × B)
□


3.5.3 Mathematical Relations and Directed Graphs 


The Cartesian product A × B provides a means to draw connections, or, in other
words, to specify relations, between elements of the two sets A and B.

3.66 Definition (relation). A relation between elements of sets A and B is a subset
R ⊆ A × B of their Cartesian product.

• Two elements X ∈ A and Y ∈ B are related with respect to the relation R if and
only if (X, Y) ∈ R.
• For each relation R ⊆ A × B, the domain 𝒟(R) ⊆ A of the relation R consists of
those elements of A related by R to at least one element of B:

𝒟(R) := {X ∈ A : ∃Y{(Y ∈ B) ∧ [(X, Y) ∈ R]}}.

• Similarly, the range ℛ(R) ⊆ B of the relation R consists of those elements of B
related by R to at least one element of A:

ℛ(R) := {Y ∈ B : ∃X{(X ∈ A) ∧ [(X, Y) ∈ R]}}.

Another common notation for (X, Y) ∈ R is X R Y, especially if such a special symbol
as ⊆, ⊊, or = denotes the relation. Relations may also be represented by graphs.
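
The domain and range of definition 3.66 can be computed directly for a finite relation.
The following Python sketch is a minimal illustration: the function names domain and
range_of, the sample sets, and the use of tuples for ordered pairs are assumptions
introduced here.

```python
def domain(relation):
    """𝒟(R): first coordinates related to at least one second coordinate."""
    return {x for (x, _) in relation}


def range_of(relation):
    """ℛ(R): second coordinates related to at least one first coordinate."""
    return {y for (_, y) in relation}


A = {1, 2, 3, 4}
B = {"a", "b", "c"}
R = {(1, "a"), (1, "b"), (3, "b")}

# R is a relation between elements of A and B: a subset of A × B.
assert R <= {(x, y) for x in A for y in B}

assert domain(R) == {1, 3}         # 2 and 4 are related to nothing
assert range_of(R) == {"a", "b"}   # nothing is related to "c"
```

The domain and range may thus be proper subsets of A and B, as the definition allows.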

3.67 Definition (directed graph). A directed graph is an ordered pair $G := (V, E)$ with a relation $E \subseteq V \times V$ between elements of the same set $A = V = B$. Elements of $V$ are called vertices. A pair $(X, Y) \in E$ is called an edge from $X$ to $Y$:


[Figure: an arrow labeled $(X, Y)$ drawn from vertex $X$ to vertex $Y$.]


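3.68 Example. For each set $A$, the diagonal $\Delta_A \subseteq A \times A$ is the subset

\[ \Delta_A = \{(X, X) \in A \times A : X \in A\}. \]

Thus, the diagonal corresponds to the relation called “equality,” or, equivalently, “identity”: a pair of elements $(X, Y)$ lies on the diagonal $\Delta_A$ if and only if $X = Y$. Because $X = X$, and hence $(X, X) \in \Delta_A$, for every $X \in A$, it follows that the diagonal $\Delta_A$ relates every element of $A$ to itself, whence $\mathcal{D}(\Delta_A) = A = \mathcal{R}(\Delta_A)$. For example, if

\[ A := \{\varnothing, \{\varnothing\}\}, \]

then

\[ A \times A = \{ (\varnothing, \{\varnothing\}),\ (\{\varnothing\}, \{\varnothing\}),\ (\varnothing, \varnothing),\ (\{\varnothing\}, \varnothing) \}, \]

\[ \Delta_A = \{ (\varnothing, \varnothing),\ (\{\varnothing\}, \{\varnothing\}) \}, \]

because the only pairs $(X, Y)$ with $X = Y$ are the pairs $(\varnothing, \varnothing)$ and $(\{\varnothing\}, \{\varnothing\})$. As a graph, the diagonal consists of a single loop at each vertex:


[Figure: a loop labeled $(X, X)$ at vertex $X$.]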


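3.69 Example. For all sets $H$ and $K$, consider their respective power sets $A := \mathcal{P}(H)$ and $B := \mathcal{P}(K)$. The relation $R \subseteq \mathcal{P}(H) \times \mathcal{P}(K)$ of inclusion consists of all, but only those, pairs $(V, W)$ of subsets $V \subseteq H$ and $W \subseteq K$ such that $V \subseteq W$:

\[ R = \{(V, W) \in \mathcal{P}(H) \times \mathcal{P}(K) : (V \subseteq H) \land (W \subseteq K) \land (V \subseteq W)\}. \]

The domain of the relation $\subseteq$ consists of all elements of $A$ (subsets of $H$) included as a subset in at least one element of $B$ (a subset of $K$). Similarly, the range of the relation $\subseteq$ consists of all elements of $B$ that contain as a subset at least one element of $A$.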


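3.70 Example. For all sets $H$ and $K$, let $A := \mathcal{P}(H)$ and $B := \mathcal{P}(K)$. The relation $S \subseteq \mathcal{P}(H) \times \mathcal{P}(K)$ of strict inclusion consists of all, but only those, pairs $(V, W)$ of subsets $V \subseteq H$ and $W \subseteq K$ such that $V \subseteq W$ but $V \neq W$:

\[ S = \{(V, W) \in \mathcal{P}(H) \times \mathcal{P}(K) : (V \subseteq H) \land (W \subseteq K) \land (V \subseteq W) \land (V \neq W)\}. \]

The strict inclusion $(V \subseteq W) \land (V \neq W)$ is also denoted by $V \subset W$ or by $V \subsetneq W$:

\[ \vdash [(V \subseteq W) \land (V \neq W)] \Leftrightarrow (V \subset W) \Leftrightarrow (V \subsetneq W). \]

The domain of the relation of strict inclusion consists of all elements of $A$ included in, but not equal to, at least one element of $B$; its range consists of all elements of $B$ that contain, but do not coincide with, at least one element of $A$.


3.71 Example. For all sets $A$ and $B$, the relational constant $\in$ relates every element $X \in A$ that is an element of some $Y \in B$. Denote this relation on $A \times B$ by $E$. (The symbol $\in$ cannot denote a subset $E$ of $A \times B$ because $\in$ is neither an individual variable nor an individual constant.) Thus, $E \subseteq (A \times B)$ is defined by

\[ [(X, Y) \in E] \Leftrightarrow [(X \in A) \land (X \in Y) \land (Y \in B)]. \]

For example, if

\[ A := \{\varnothing, \{\varnothing\}\}, \qquad B := \{\varnothing, \{\varnothing\}\}, \]

then

\[ A \times B = \{ (\varnothing, \{\varnothing\}),\ (\{\varnothing\}, \{\varnothing\}),\ (\varnothing, \varnothing),\ (\{\varnothing\}, \varnothing) \}, \]

\[ E = \{ (\varnothing, \{\varnothing\}) \}, \]

because in $A \times B$ the only pair $(X, Y)$ such that $X \in Y$ is the pair $(\varnothing, \{\varnothing\})$.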
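3.72 Example. For the sets

\[ A := \{\varnothing, \{\varnothing\}\}, \qquad B := \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}, \]

the Cartesian product $A \times B$ and the relation $E$ take the forms

\[ A \times B = \{ (\varnothing, \{\varnothing, \{\varnothing\}\}),\ (\{\varnothing\}, \{\varnothing, \{\varnothing\}\}),\ (\varnothing, \{\varnothing\}),\ (\{\varnothing\}, \{\varnothing\}),\ (\varnothing, \varnothing),\ (\{\varnothing\}, \varnothing) \}, \]

\[ E = \{ (\varnothing, \{\varnothing, \{\varnothing\}\}),\ (\{\varnothing\}, \{\varnothing, \{\varnothing\}\}),\ (\varnothing, \{\varnothing\}) \}, \]

because in $A \times B$ the only pairs $(X, Y)$ such that $X \in Y$ are those just displayed.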


In some contexts a relation $R \subseteq A \times B$ may also prove useful with $B$ and $A$ listed in the reverse order. Then the “inverse” relation $R^{\circ -1} \subseteq B \times A$ contains the same pairs as $R$ does but with coordinates listed in the reverse order. Some texts denote the inverse relation by $R^{-1}$, which can cause confusion because the same notation also represents reciprocals in arithmetic and algebra. The notation adopted here for the inverse relation, $R^{\circ -1}$, conforms to [70].

3.73 Definition (inverse relation). The inverse of a relation $R \subseteq A \times B$ between two sets $A$ and $B$ is the relation $R^{\circ -1} \subseteq B \times A$ defined by

\[ R^{\circ -1} = \{(Y, X) \in B \times A : (X, Y) \in R\}. \]


3.74 Example. The inverse of the diagonal is the same diagonal.

3.75 Example. The inverse of the subset relation $\subseteq$ is the superset relation $\supseteq$.


3.76 Definition (composite relation). The composition of two relations $R \subseteq A \times B$ and $S \subseteq B \times C$ is the relation $S \circ R$ on $A \times C$ defined as follows. The composite relation $S \circ R$ relates an element $U \in A$ to an element $W \in C$ if and only if $R$ relates $U$ to some element $V \in B$ and $S$ relates the same element $V$ to $W \in C$:

\[ (S \circ R) = \{(U, W) \in (A \times C) : \exists V([V \in B] \land [(U, V) \in R] \land [(V, W) \in S])\}. \]


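3.77 Example. For each relation $R \subseteq A \times B$ between any sets $A$ and $B$, the composition $R \circ R^{\circ -1}$ contains the diagonal $\Delta_{\mathcal{R}(R)}$ for the range of $R$. Indeed, by definition of its range, $R$ relates to every $Y \in \mathcal{R}(R)$ some element $X \in A$, so that $(X, Y) \in R$; then the inverse $R^{\circ -1}$ relates $Y$ to $X$, so that $(Y, X) \in R^{\circ -1}$. From $(Y, X) \in R^{\circ -1}$ and $(X, Y) \in R$ it follows that $(Y, Y) \in (R \circ R^{\circ -1})$ for every $(Y, Y) \in \Delta_{\mathcal{R}(R)}$. Therefore $\Delta_{\mathcal{R}(R)} \subseteq (R \circ R^{\circ -1})$. Similarly, $R^{\circ -1} \circ R$ contains the diagonal $\Delta_{\mathcal{D}(R)}$ for the domain of $R$. Indeed, by definition of its domain, $R$ relates every $X \in \mathcal{D}(R)$ to an element $Y \in B$, so that $(X, Y) \in R$; then the inverse $R^{\circ -1}$ relates $Y$ to $X$, so that $(Y, X) \in R^{\circ -1}$. From $(X, Y) \in R$ and $(Y, X) \in R^{\circ -1}$ it follows that $(X, X) \in (R^{\circ -1} \circ R)$ for every $(X, X) \in \Delta_{\mathcal{D}(R)}$. Therefore $\Delta_{\mathcal{D}(R)} \subseteq (R^{\circ -1} \circ R)$.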
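
The inverse and the composition of relations, and the containments in example 3.77, can be checked on a small case. The following Python sketch is only an illustration: the relation `R` and the helper names `inverse`, `compose`, and `diagonal` are hypothetical and not part of the text.

```python
def inverse(relation):
    """Return the inverse relation: every pair with its coordinates swapped."""
    return {(y, x) for (x, y) in relation}

def compose(s, r):
    """Return S o R: pairs (u, w) such that (u, v) is in R and (v, w) is in S."""
    return {(u, w) for (u, v) in r for (v2, w) in s if v == v2}

def diagonal(elements):
    """Return the diagonal {(x, x)} of the given set."""
    return {(x, x) for x in elements}

# Hypothetical relation between A = {1, 2, 3} and B = {'a', 'b', 'c'}.
R = {(1, 'a'), (2, 'a'), (3, 'b')}
R_inv = inverse(R)

# R o R^{o-1} contains the diagonal of the range {'a', 'b'} of R.
print(diagonal({'a', 'b'}) <= compose(R, R_inv))   # True

# R^{o-1} o R contains the diagonal of the domain {1, 2, 3} of R.
print(diagonal({1, 2, 3}) <= compose(R_inv, R))    # True

# The containment can be strict: 1 and 2 are both related to 'a'.
print((1, 2) in compose(R_inv, R))                 # True
```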


Because every relation $R$ between sets $A$ and $B$ is a subset $R \subseteq (A \times B)$, operations with sets apply to all relations.


3.78 Definition (unions and intersections of relations). For all sets $A$, $B$, $C$, $E$, and for all relations $R \subseteq A \times B$ and $T \subseteq C \times E$, the union of the relations $R$ and $T$ is the relation $R \cup T$ between $A \cup C$ and $B \cup E$, so that

\[ (R \cup T) := \{(X, Y) \in [(A \cup C) \times (B \cup E)] : [(X, Y) \in R] \lor [(X, Y) \in T]\}. \]

Similarly, the intersection of the relations $R$ and $T$ is the relation $R \cap T$ between $A \cap C$ and $B \cap E$, so that

\[ (R \cap T) := \{(X, Y) \in [(A \cap C) \times (B \cap E)] : [(X, Y) \in R] \land [(X, Y) \in T]\}. \]


A particular instance of intersections of relations $R \subseteq (A \times B)$ and $T \subseteq (A \times B)$ arises when $T$ is the Cartesian product $S \times B$ of a subset $S \subseteq A$ with $B$. The “restriction” of $R$ to $S$ is then the relation $R \cap T$ with $T = (S \times B)$. The concept of the “restriction” of a relation to a subset is useful if the subset has characteristics that serve the purpose at hand while the complement of the subset does not.

3.79 Definition (restricted relation). For each relation $R \subseteq (A \times B)$, and for each subset $S \subseteq A$ of $A$, the restriction of $R$ to $S$ is the relation $R|_S \subseteq (S \times B)$ defined by

\[ R|_S := R \cap (S \times B) = \{(X, Y) \in R : X \in S\}. \]

Thus, $R|_S$ restricts its first coordinates only to those elements of $S$.


There also exists a similar instance of intersections of relations $R \subseteq (A \times B)$ and $T \subseteq (A \times B)$, where $T$ is the Cartesian product $A \times V$ of $A$ with a subset $V \subseteq B$. The “restriction” of $R$ to $V$ is then the relation $R \cap T$ with $T = (A \times V)$. Similarly, the restriction of $R$ to $S \subseteq A$ and $V \subseteq B$ is the relation $R \cap (S \times V)$.


3.5.4 Exercises on Cartesian Products of Sets 


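3.111. Determine whether the Cartesian product is associative.

3.112. For each set $A$, prove that $A \times \varnothing = \varnothing$, and that $\varnothing \times A = \varnothing$.

3.113. Prove that $A \times B = \varnothing$ if and only if $A = \varnothing$ or $B = \varnothing$.

3.114. Prove that $\times$ distributes over $\cap$: $[(A \times B) \cap (C \times B)] = [(A \cap C) \times B]$.

3.115. Prove that $\times$ distributes over $\cup$: $[(A \times B) \cup (C \times B)] = [(A \cup C) \times B]$.

3.116. Prove that $[(A \setminus C) \times B] = [(A \times B) \setminus (C \times B)]$.

3.117. Prove that $[A \times (B \setminus D)] = [(A \times B) \setminus (A \times D)]$.

3.118. Prove or disprove that $[(A \setminus C) \times (B \setminus D)] \subseteq [(A \times B) \setminus (C \times D)]$.

3.119. Prove or disprove that $[(A \setminus C) \times (B \setminus D)] \supseteq [(A \times B) \setminus (C \times D)]$.

3.120. Prove or disprove that $[(A \,\Delta\, C) \times (B \,\Delta\, D)] \subseteq [(A \times B) \,\Delta\, (C \times D)]$.

3.121. Prove or disprove that $[(A \,\Delta\, C) \times (B \,\Delta\, D)] \supseteq [(A \times B) \,\Delta\, (C \times D)]$.

3.122. Prove or disprove that $([\mathcal{P}(A)] \times [\mathcal{P}(B)]) \subseteq \mathcal{P}(A \times B)$.

3.123. Prove or disprove that $([\mathcal{P}(A)] \times [\mathcal{P}(B)]) \supseteq \mathcal{P}(A \times B)$.

3.124. Provide a formula for the inverse of the relation of inclusion.

3.125. Give a formula for the inverse of the relation of strict inclusion.

3.126. For each relation $R \subseteq A \times B$, and for each subset $S \subseteq A$ of $A$, prove that $R|_S = R \cap (S \times B)$.

3.127. For all sets $A$ and $B$, prove that $\varnothing$ is an element of the domain $\mathcal{D}$ of the relation $\subseteq$ on $\mathcal{P}(A) \times \mathcal{P}(B)$.

3.128. For all sets $A$ and $B$, prove that $\mathcal{P}(B)$ is the range $\mathcal{R}$ of the relation $\subseteq$ on $\mathcal{P}(A) \times \mathcal{P}(B)$.

3.129. Provide examples of sets $A$ and $B$ such that $\mathcal{P}(A)$ is the domain $\mathcal{D}$ of the relation $\subseteq$ on $\mathcal{P}(A) \times \mathcal{P}(B)$.

3.130. Provide examples of sets $A$ and $B$ such that $\mathcal{P}(A)$ is not the domain $\mathcal{D}$ of the relation $\subseteq$ on $\mathcal{P}(A) \times \mathcal{P}(B)$, so that $\mathcal{D} \subsetneq \mathcal{P}(A)$.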


3.6 Mathematical Functions 


Functions are relations relating each element of their domain to exactly one element of their range.


3.6.1 Mathematical Functions 


In some applications of relations, the domain and the range can contain measure- 
ments. 


3.80 Example. Results from astronomical observations can consist of a relation 
between two coordinates of position, with ordered pairs (X,Y) where X is the 
observed ascension (elevation) and Y is the observed declination (azimuth) of an 
asteroid. For example, the following pairs (X, Y) record the ascension X and the 
declination Y of the asteroid Pallas, measured by Baron von Zach about 1800 A.D. 
[14, p. 5]: 


(0,408) (30,89)  (60,-66) (90,10) (120,338) (150,807) 
(180, 1238) (210, 1511) (240, 1583) (270, 1462) (300, 1183) (330, 804) 


with the corresponding graphical representation in figure 3.3. 


Results from astronomical observations can also consist of a relation between time and position, with ordered pairs $(T, Y)$ where $Y$ is the observed position (declination or azimuth) of a planet at time $T$. Such a relation has the following properties.


• If no observation was made at some time $T$, then the results do not contain any ordered pair with $T$ as their first coordinate.

• Each observation yields only one position $Y$ at any time $T$.

• Observations can yield the same position $Y$ at several times, for instance, if the motion is periodic.


Mathematical “functions” are relations corresponding to such applications.

Fig. 3.3. Positions include at most one declination for each ascension.


3.81 Definition. A function from a set $A$ to a set $B$ is a relation $F \subseteq (A \times B)$ such that for each $X \in A$ there exists at most one $Y \in B$ for which $(X, Y) \in F$.

The domain of $F$ is the subset $\mathcal{D}(F) \subseteq A$ consisting of every $X \in A$ for which there exists some $Y \in B$ such that $(X, Y) \in F$:

\[ \mathcal{D}(F) = \{X \in A : \exists Y[(Y \in B) \land \{(X, Y) \in F\}]\}. \]

The range of $F$, denoted by $\mathcal{R}(F)$, consists of every $Y \in B$ such that there exists some $X \in A$ such that $(X, Y) \in F$:

\[ \mathcal{R}(F) = \{Y \in B : \exists X[(X \in A) \land \{(X, Y) \in F\}]\}. \]

Moreover, $B$ is called the co-domain of $F$.

For each $X \in \mathcal{D}(F)$, the unique $Y$ such that $(X, Y) \in F$ is called the value of $F$ at $X$, or also the image of $X$ by $F$. The same $Y$ is denoted by $F(X)$ (read “$F$ of $X$”); thus, $Y = F(X)$ if and only if $(X, Y) \in F$. The element $X$ is called the argument of $F$ in the expression $F(X)$. The notation

\[ F : A \to B \]

(read “$F$ maps $A$ to $B$”) means that $F$ is a function from $A$ to $B$; the notation $X \mapsto F(X)$ (read “$F$ maps $X$ to $F(X)$”) may specify $F(X)$ by a formula or otherwise.

3.82 Remark. A variant of the concept of a function from a set $A$ to a set $B$ is a subset $F \subseteq (A \times B)$ such that for each $X \in A$ there exists exactly one $Y \in B$ for which $(X, Y) \in F$, and then the notation $F : A \to B$ implies that $\mathcal{D}(F) = A$.

The requirement that $\mathcal{D}(F) = A$ remains harmless with simple examples of functions, but it presents unnecessary obstacles with more realistic examples of functions, whose complexity can make the domain difficult or impossible to identify, especially if the identification of the domain is irrelevant to the task at hand.

The specification with “at most” in definition 3.81 is common in set theory [128, p. 58 & 86], algebraic geometry (especially with “rational” functions) [136, p. 34-35], complex analysis (especially with “meromorphic” functions) [1, p. 128], and functional analysis [130, p. 4 & 18], in instances where several functions called “operators” have different domains but arise from a relation common to all of them.


3.83 Example. The pairs in example 3.80 define a function $F : A \to B$ from the set of ascensions $A := \{0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330\}$ to the set of declinations $B := \{-66, 10, 89, 338, 408, 804, 807, 1183, 1238, 1462, 1511, 1583\}$. This function $F : A \to B$ has domain $A$, codomain $B$, and range $B$.


3.84 Example. With $A := \{6, 7, 8, 9\}$ and $B := \{1, 2, 3, 4\}$, let

\[ F := \{(6, 2), (7, 1), (8, 2), (9, 3)\}. \]

Then $F : A \to B$ is a function with domain $A$, codomain $B$, and range $\{1, 2, 3\}$.
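
The “at most one” condition of definition 3.81 can be tested mechanically on a finite relation. The following Python sketch is an illustration under the assumption that a relation is stored as a set of pairs; the helper name `is_function` and the relation `not_a_function` are hypothetical, while `F` is the function from example 3.84.

```python
def is_function(relation):
    """Return True if no first coordinate is paired with two different values."""
    seen = {}
    for x, y in relation:
        if x in seen and seen[x] != y:
            return False      # x is related to two distinct elements
        seen[x] = y
    return True

# The function F of example 3.84.
F = {(6, 2), (7, 1), (8, 2), (9, 3)}
print(is_function(F))                 # True
print({x for (x, _y) in F})           # domain: {6, 7, 8, 9}
print({y for (_x, y) in F})           # range:  {1, 2, 3}

# Hypothetical relation that pairs 6 with two different values.
not_a_function = {(6, 2), (6, 3)}
print(is_function(not_a_function))    # False
```

Two pairs may share a second coordinate, as with (6, 2) and (8, 2), without violating the definition; only a repeated first coordinate with different second coordinates does.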


3.85 Example. For all nonempty sets $A$ and $B$, and for any element $Z \in B$, there exists a constant function

\[ C_Z : A \to B, \qquad X \mapsto Z. \]

The constant function $C_Z$ maps every element $X \in A$ to the same value $Z$, so that

\[ C_Z := \{(X, Z) : X \in A\}. \]

In particular, $\mathcal{D}(C_Z) = A$ for the domain of $C_Z$, and $\mathcal{R}(C_Z) = \{Z\}$ for its range.


3.86 Example. For each set $A$, there exists an identity function

\[ I_A : A \to A, \qquad X \mapsto X. \]

Thus the identity function $I_A$ maps every element to itself, so that

\[ I_A := \{(X, X) : X \in A\}. \]

In particular, $\mathcal{D}(I_A) = A$ and $\mathcal{R}(I_A) = A$, because $(X, X) \in I_A$ for every $X \in A$. The function $I_A$ is also denoted by $\Delta_A$ and is called the “diagonal” of $A \times A$.




3.87 Example. For all sets $A$ and $B$, the canonical projection functions from the Cartesian product $A \times B$ into its factors $A$ and $B$ are the functions $P_A$ and $P_B$ with

\[ P_A : (A \times B) \to A, \qquad (X, Y) \mapsto X; \]

\[ P_B : (A \times B) \to B, \qquad (X, Y) \mapsto Y. \]

Thus, $P_A$ maps $(X, Y)$ to its first coordinate $X$ in $A$, whereas $P_B$ maps $(X, Y)$ to its second coordinate $Y$ in $B$. The domain of $P_A$ and $P_B$ is $A \times B$, but the range of $P_A$ is $A$, whereas the range of $P_B$ is $B$.


3.88 Example. For all sets $A$ and $B$, for any subset $V \subseteq A$ and any element $Z \in B$, the slice function $S_{V,Z}$ maps each element $X \in V$ to $(X, Z) \in (A \times B)$:

\[ S_{V,Z} : V \to (A \times B), \qquad X \mapsto (X, Z). \]

Thus, $S_{V,Z}$ maps its domain $V$ to the slice $V \times \{Z\}$ in $A \times B$.


3.89 Example. For each set $A$ and each subset $S \subseteq A$, the characteristic function $\chi_S$, with the Greek letter $\chi$ (read “chi”), maps every element of the subset $S$ to 1, and every element outside the subset $S$ to 0 (with $0 = \varnothing$, $1 = \{\varnothing\}$, and $2 = \{0, 1\}$):

\[ \chi_S : A \to 2, \qquad X \mapsto \begin{cases} 1 & \text{if } X \in S, \\ 0 & \text{if } X \notin S. \end{cases} \]


The following theorem provides a means to compare two functions to each other.


3.90 Theorem. Two functions $F : A \to B$ and $G : A \to B$ are equal, $F = G$, if and only if they have the same domain $D \subseteq A$ and $F(X) = G(X)$ for every $X \in D$.


Proof. This proof rewrites $F(X) = G(X)$ in terms of the definition of functions:

$F = G$
$\Updownarrow$ extensionality,
$\forall X(\forall Y\{[(X, Y) \in F] \Leftrightarrow [(X, Y) \in G]\})$
$\Updownarrow$ functional notation.
$\forall X(\forall Y\{[Y = F(X)] \Leftrightarrow [Y = G(X)]\})$ □

Some situations involve only parts of a function, or combinations of several functions, for instance, as defined by the concepts introduced here.

3.91 Definition (restriction of functions). For each function $F : A \to B$, and for each subset $S \subseteq A$ of $A$, the restriction of $F$ to $S$ is the function $F|_S \subseteq S \times B$ defined by

\[ F|_S : S \to B, \qquad X \mapsto F(X). \]

Thus, $F|_S(X) = F(X)$, but $F|_S$ maps only the elements of $S \cap \mathcal{D}(F)$.
3.92 Example. With A := {6, 7,8, 9} and B := {1,2, 3, 4}, let 


F := {(6, 2), (7, 1), (8, 2), (9, 3)}. 


Then the restriction of F to the subset S := {6, 8} is F|s = {(6, 2), (8, 2)}. 
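
The restriction in example 3.92 can be reproduced with a finite function stored as a Python dictionary, where keys are arguments and values are images. This representation and the helper name `restrict` are assumptions made for this sketch.

```python
def restrict(function, subset):
    """Return the restriction of the function to the given subset of arguments."""
    return {x: y for x, y in function.items() if x in subset}

# The function F of example 3.92, stored as argument -> value.
F = {6: 2, 7: 1, 8: 2, 9: 3}

print(restrict(F, {6, 8}))    # {6: 2, 8: 2}

# Elements of S outside the domain of F are ignored:
# the restriction maps only the elements of S that lie in the domain of F.
print(restrict(F, {8, 10}))   # {8: 2}
```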


3.93 Example. For each set $B$, and for each subset $W \subseteq B$, the inclusion function, denoted by the Greek letter $\iota$ (“iota”), is the restriction of the identity function to $W$:

\[ \iota_W : W \to B, \qquad X \mapsto X. \]

Thus $\iota_W : W \to B$ is the restriction of $I_B : B \to B$ to the subset $W \subseteq B$.


3.94 Definition (union of functions). For all disjoint sets $A$ and $C$, so that $A \cap C = \varnothing$, for all sets $B$ and $E$, and for all functions $F : A \to B$ and $G : C \to E$, the union of the functions $F$ and $G$ is the union $F \cup G$ of the two sets $F \subseteq A \times B$ and $G \subseteq C \times E$:

\[ F \cup G : (A \cup C) \to (B \cup E), \qquad X \mapsto \begin{cases} F(X) & \text{if } X \in \mathcal{D}(F) \subseteq A, \\ G(X) & \text{if } X \in \mathcal{D}(G) \subseteq C. \end{cases} \]

3.95 Example (overloading operators). In a computer language such as C++, “overloading” the addition (+) from numbers to pairs of numbers corresponds to forming the union of the addition function $F$ defined on a set $A$ of numbers and an addition function $G$ defined on a set $C$ of pairs of numbers.
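
The union of two functions with disjoint domains can be sketched with Python dictionaries. The example is hypothetical: a “doubling” function on a few numbers and a “doubling” function on a few pairs of numbers are merged into one function, in the spirit of overloading. The helper name `union_of_functions` and the sample data are assumptions of this sketch, which checks only that the two domains are disjoint.

```python
def union_of_functions(f, g):
    """Return the union of two functions whose domains are disjoint."""
    overlap = set(f) & set(g)
    if overlap:
        raise ValueError(f"domains are not disjoint: {overlap}")
    return {**f, **g}

# Hypothetical function on numbers and hypothetical function on pairs of numbers.
double_number = {1: 2, 2: 4}
double_pair = {(1, 2): (2, 4), (2, 2): (4, 4)}

# The union is defined on numbers and on pairs alike.
double = union_of_functions(double_number, double_pair)
print(double[2])         # 4
print(double[(1, 2)])    # (2, 4)
```

Raising an error when the domains overlap is a design choice of the sketch: it mirrors the hypothesis that the domains are disjoint, without which the union of the two sets of pairs might fail to be a function.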


3.96 Definition (intersection of functions). For all sets $A$, $B$, $C$, $E$, and all functions $F : A \to B$ and $G : C \to E$, the intersection of the functions $F$ and $G$ is the intersection $F \cap G$ of the two sets $F \subseteq A \times B$ and $G \subseteq C \times E$, so that

\[ F \cap G : D \to (B \cap E), \qquad X \mapsto F(X) = G(X), \]

with domain $D := \mathcal{D}(F \cap G) = \{X \in A \cap C : F(X) = G(X)\}$.

3.6.2 Images and Inverse Images of Sets by Functions


Some situations involve the images by functions of not only single elements in the 
domain, but also subsets of the domain, as defined here. 


3.97 Definition (image of a set). For each function $F : A \to B$ and each subset $V \subseteq A$, the image of $V$ by $F$ is the subset $F''(V)$ consisting of the images by $F$ of all $X \in V \cap \mathcal{D}(F)$:

\[ F''(V) = \{Y \in B : \exists X[(X \in V) \land (F(X) = Y)]\} = \{F(X) : X \in V \cap \mathcal{D}(F)\}. \]

For each function $F : A \to B$, $F''$ is a function of subsets: $F'' : \mathcal{P}(A) \to \mathcal{P}(B)$.


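3.98 Remark. The notation $F''(V)$ adopted in definition 3.97 is common in set theory [74, p. 14], [128, p. 65]. Informal usage employs the notation $F(V)$ for the image of a subset $V \subseteq A$ by a function $F : A \to B$, but this usage is ambiguous. For example, consider the set $A := \{\varnothing, \{\varnothing\}\}$ and the constant function

\[ C_\varnothing : \{\varnothing, \{\varnothing\}\} \to \{\varnothing, \{\varnothing\}\}, \qquad X \mapsto \varnothing. \]

The set $\{\varnothing\}$ is an element of $A$, whence $C_\varnothing(\{\varnothing\}) = \varnothing$ because $C_\varnothing(X) = \varnothing$ for every $X \in A$. Yet $\{\varnothing\}$ is also a subset of $A$, containing the single element $\varnothing \in \{\varnothing\}$; because $C_\varnothing(\varnothing) = \varnothing$ it follows that $C_\varnothing''(\{\varnothing\}) = \{\varnothing\}$ as the image of a subset:

\[ C_\varnothing(\{\varnothing\}) = \varnothing, \qquad C_\varnothing''(\{\varnothing\}) = \{\varnothing\}. \]

The common informal notation leads to the contradiction

\[ C_\varnothing(\{\varnothing\}) = \varnothing, \qquad C_\varnothing(\{\varnothing\}) = \{\varnothing\}. \]

In a formal theory containing this contradiction, every proposition would be True.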
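
The distinction between the value of a function at an element and the image of a subset can be reproduced with nested frozen sets. The following Python sketch is an illustration only; the names `EMPTY`, `ONE`, `C`, and `image` are assumptions, and `frozenset` is used because Python requires hashable objects as dictionary keys and set elements.

```python
EMPTY = frozenset()              # the empty set
ONE = frozenset({EMPTY})         # the set whose only element is the empty set
A = frozenset({EMPTY, ONE})      # the set A of remark 3.98

# The constant function mapping every element of A to the empty set.
C = {x: EMPTY for x in A}

def image(function, subset):
    """Return the set of values of the function at the elements of the subset."""
    return frozenset(function[x] for x in subset if x in function)

# ONE is an element of A: the value of C at ONE is the empty set.
print(C[ONE] == EMPTY)           # True

# ONE is also a subset of A: its image is the set containing the empty set.
print(image(C, ONE) == ONE)      # True

# The two results differ, so one notation cannot serve both purposes.
print(C[ONE] == image(C, ONE))   # False
```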


3.99 Example. If $C_Z : A \to B$ is a constant function that maps each $X \in A$ to the same $Z \in B$, then $C_Z''(V) = \{Z\}$ for each nonempty subset $V \subseteq A$.


3.100 Example. If $I_A : A \to A$, $X \mapsto X$ is an identity function, so that $I_A(X) = X$ for each $X \in A$, then $I_A''(V) = V$ for each subset $V \subseteq A$. Thus, $I_A''$ is the identity function $I_A'' = I_{\mathcal{P}(A)} : \mathcal{P}(A) \to \mathcal{P}(A)$, with $V \mapsto V$ for every $V \in \mathcal{P}(A)$.


Besides images of subsets, such problems as the solution of equations involve the identification of a subset, called a pre-image, mapped to a specified image.

3.101 Definition (pre-image). For each function $F : A \to B$, and for each element $Y \in B$, the pre-image of $Y$ by $F$ is the subset of $A$ denoted by $F^{\circ -1\,\prime\prime}(\{Y\})$ and consisting of all elements $X \in A$ such that $F(X) = Y$:

\[ F^{\circ -1\,\prime\prime}(\{Y\}) = \{X \in A : F(X) = Y\}. \]

For each subset $W \subseteq B$, the inverse image, or pre-image, of $W$ by $F$ is the subset $F^{\circ -1\,\prime\prime}(W)$ of $A$ consisting of all pre-images of all elements of $W$ by $F$:

\[ F^{\circ -1\,\prime\prime}(W) = \{X \in A : F(X) \in W\}. \]


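3.102 Example. If $C_Z : A \to B$ is a constant function that maps each $X \in A$ to the same $Z \in B$, then $C_Z^{\circ -1\,\prime\prime}(\{Z\}) = A$.

Also, $C_Z^{\circ -1\,\prime\prime}(W) = A$ for each subset $W \subseteq B$ for which $Z \in W$.

In contrast, $C_Z^{\circ -1\,\prime\prime}(S) = \varnothing$ for each subset $S \subseteq B$ for which $Z \notin S$. Thus,

\[ C_Z^{\circ -1\,\prime\prime}(W) = \begin{cases} A & \text{if } Z \in W, \\ \varnothing & \text{if } Z \notin W. \end{cases} \]


3.103 Example. If $I_A : A \to A$ is an identity function, with $I_A(X) = X$ for each $X \in A$, then $I_A^{\circ -1\,\prime\prime}(W) = W$ for each subset $W \subseteq A$.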
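
Pre-images of subsets can be computed directly from definition 3.101 when the function is finite. The sketch below stores a function as a Python dictionary; the helper name `preimage`, the constant function `C`, and its labels `'Z'` and `'W'` are hypothetical, while `F` is the function from example 3.84.

```python
def preimage(function, subset):
    """Return all arguments whose value lies in the given subset."""
    return {x for x, y in function.items() if y in subset}

# The function F of example 3.84.
F = {6: 2, 7: 1, 8: 2, 9: 3}
print(preimage(F, {2}))       # {6, 8}: two solutions of F(X) = 2
print(preimage(F, {4}))       # set(): no solution of F(X) = 4
print(preimage(F, {1, 3}))    # {7, 9}

# A constant function, as in example 3.102, with hypothetical values.
C = {x: 'Z' for x in {0, 1, 2}}
print(preimage(C, {'Z', 'W'}) == {0, 1, 2})   # True: the subset contains Z
print(preimage(C, {'W'}) == set())            # True: the subset omits Z
```

The three outcomes for `F` correspond to an equation with several solutions, no solution, and exactly one solution per target value.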


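Theorem 3.104 relates images and pre-images to unions and intersections.


3.104 Theorem. For each function $F : A \to B$, for each set $\mathcal{F}$ of subsets of $A$, and for each set $\mathcal{G}$ of subsets of $B$, the following relations hold.

\[ F^{\circ -1\,\prime\prime}\Bigl(\bigcup \mathcal{G}\Bigr) = \bigcup_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W), \]

\[ F^{\circ -1\,\prime\prime}\Bigl(\bigcap \mathcal{G}\Bigr) = \bigcap_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W), \]

\[ F''\Bigl(\bigcup \mathcal{F}\Bigr) = \bigcup_{V \in \mathcal{F}} F''(V), \]

\[ F''\Bigl(\bigcap \mathcal{F}\Bigr) \subseteq \bigcap_{V \in \mathcal{F}} F''(V). \]

Proof. Apply the definitions of union and intersection. For $F^{\circ -1\,\prime\prime}(\bigcup \mathcal{G})$,

$F^{\circ -1\,\prime\prime}(\bigcup \mathcal{G}) = \bigcup_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W)$ yet unproved,
$\Updownarrow$ extensionality (S1),
$\forall X\{[X \in F^{\circ -1\,\prime\prime}(\bigcup \mathcal{G})] \Leftrightarrow [X \in \bigcup_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W)]\}$
$\Updownarrow$ definitions of $F^{\circ -1\,\prime\prime}$ and $\bigcup$,
$\forall X\{[F(X) \in (\bigcup \mathcal{G})] \Leftrightarrow (\exists W\{(W \in \mathcal{G}) \land [F(X) \in W]\})\}$
$\Updownarrow$ definition of $\bigcup$,
$\forall X\{(\exists W\{(W \in \mathcal{G}) \land [F(X) \in W]\}) \Leftrightarrow (\exists W\{(W \in \mathcal{G}) \land [F(X) \in W]\})\}$

which holds thanks to theorem 1.63: $(P) \Leftrightarrow (P)$. For $F^{\circ -1\,\prime\prime}(\bigcap \mathcal{G})$,

$F^{\circ -1\,\prime\prime}(\bigcap \mathcal{G}) = \bigcap_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W)$ yet unproved,
$\Updownarrow$ extensionality (S1),
$\forall X\{[X \in F^{\circ -1\,\prime\prime}(\bigcap \mathcal{G})] \Leftrightarrow [X \in \bigcap_{W \in \mathcal{G}} F^{\circ -1\,\prime\prime}(W)]\}$
$\Updownarrow$ definitions of $F^{\circ -1\,\prime\prime}$ and $\bigcap$,
$\forall X\{[F(X) \in (\bigcap \mathcal{G})] \Leftrightarrow (\forall W\{(W \in \mathcal{G}) \Rightarrow [F(X) \in W]\})\}$
$\Updownarrow$ definition of $\bigcap$,
$\forall X\{(\forall W\{(W \in \mathcal{G}) \Rightarrow [F(X) \in W]\}) \Leftrightarrow (\forall W\{(W \in \mathcal{G}) \Rightarrow [F(X) \in W]\})\}$

which holds thanks to theorem 1.63: $(P) \Leftrightarrow (P)$. For $F''(\bigcup \mathcal{F})$,

$F''(\bigcup \mathcal{F}) = \bigcup_{V \in \mathcal{F}} F''(V)$ yet unproved,
$\Updownarrow$ extensionality (S1),
$\forall Y\{[Y \in F''(\bigcup \mathcal{F})] \Leftrightarrow [Y \in \bigcup_{V \in \mathcal{F}} F''(V)]\}$
$\Updownarrow$ definitions: $F''$, $\bigcup$,
$\forall Y[(\exists X\{[X \in (\bigcup \mathcal{F})] \land [Y = F(X)]\}) \Leftrightarrow (\exists V\{[V \in \mathcal{F}] \land [Y \in F''(V)]\})]$
$\Updownarrow$ definitions: $\bigcup$, $F''$,
$\forall Y[(\exists X(\{\exists V[(V \in \mathcal{F}) \land (X \in V)]\} \land [Y = F(X)])) \Leftrightarrow \{\exists V[(V \in \mathcal{F}) \land (\exists X\{(X \in V) \land [Y = F(X)]\})]\}]$
$\Updownarrow$ no $X$ in $V \in \mathcal{F}$,
$\forall Y[(\exists X(\{\exists V[(V \in \mathcal{F}) \land (X \in V)]\} \land [Y = F(X)])) \Leftrightarrow (\exists V\{\exists X[(V \in \mathcal{F}) \land \{(X \in V) \land [Y = F(X)]\}]\})]$

which holds thanks to the associativity of $\land$ (theorem 1.66). For $F''(\bigcap \mathcal{F})$,

$F''(\bigcap \mathcal{F}) \subseteq \bigcap_{V \in \mathcal{F}} F''(V)$ yet unproved,
$\Updownarrow$ definition 3.7,
$\forall Y\{[Y \in F''(\bigcap \mathcal{F})] \Rightarrow [Y \in \bigcap_{V \in \mathcal{F}} F''(V)]\}$
$\Updownarrow$ definitions: $F''$, $\bigcap$,
$\forall Y[(\exists X\{[X \in (\bigcap \mathcal{F})] \land [Y = F(X)]\}) \Rightarrow (\forall V\{(V \in \mathcal{F}) \Rightarrow [Y \in F''(V)]\})]$
$\Updownarrow$ definitions: $\bigcap$, $F''$,
$\forall Y(\{\exists X([\forall V\{(V \in \mathcal{F}) \Rightarrow (X \in V)\}] \land [Y = F(X)])\} \Rightarrow \{\forall V[(V \in \mathcal{F}) \Rightarrow (\exists X\{(X \in V) \land [Y = F(X)]\})]\})$
$\Updownarrow$ theorem 2.63: no $X$ in $(V \in \mathcal{F})$,
$\forall Y(\{\exists X([\forall V\{(V \in \mathcal{F}) \Rightarrow (X \in V)\}] \land [Y = F(X)])\} \Rightarrow \{\forall V \exists X[(V \in \mathcal{F}) \Rightarrow \{(X \in V) \land [Y = F(X)]\}]\})$

with $P$ standing for $(V \in \mathcal{F})$, $Q$ for $(X \in V)$, and $R$ for $[Y = F(X)]$, which holds thanks to $\{[(P) \Rightarrow (Q)] \land (R)\} \Rightarrow \{(P) \Rightarrow [(Q) \land (R)]\}$ by theorem 1.84, and $\{\exists X[\forall V(W)]\} \Rightarrow \{\forall V[\exists X(W)]\}$ by theorem 2.73. □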
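
The four relations of theorem 3.104 can be spot-checked on a finite example. Such a check is not a proof; it only confirms the statements for one choice of data. In the Python sketch below, `F` is the function of example 3.84, whereas the families `family_A` and `family_B` and the helper names `image` and `preimage` are assumptions made for illustration.

```python
def image(function, subset):
    """Return the set of values of the function at the elements of the subset."""
    return {function[x] for x in subset if x in function}

def preimage(function, subset):
    """Return all arguments whose value lies in the given subset."""
    return {x for x, y in function.items() if y in subset}

F = {6: 2, 7: 1, 8: 2, 9: 3}
family_A = [{6, 7}, {7, 9}]    # hypothetical family of subsets of A
family_B = [{1, 2}, {2, 3}]    # hypothetical family of subsets of B

union_A, inter_A = set().union(*family_A), set.intersection(*family_A)
union_B, inter_B = set().union(*family_B), set.intersection(*family_B)

# Pre-images commute with unions and with intersections.
print(preimage(F, union_B) == set().union(*(preimage(F, w) for w in family_B)))
print(preimage(F, inter_B) == set.intersection(*(preimage(F, w) for w in family_B)))

# Images commute with unions.
print(image(F, union_A) == set().union(*(image(F, v) for v in family_A)))

# For intersections of images, only an inclusion is asserted.
print(image(F, inter_A) <= set.intersection(*(image(F, v) for v in family_A)))
```

All four lines print `True` for this data.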


3.6.3 Exercises on Mathematical Functions


3.131. Determine whether $F := \{(0, 1), (2, 3), (1, 2), (0, 4)\}$ is a function.

3.132. Determine whether $G := \{(9, 2), (7, 3), (8, 2), (6, 1)\}$ is a function.

3.133. Determine whether the following relation is a function:
$R := \{(0, 1), (1, 2), (2, 4), (3, 8), (4, 8), (5, 4), (6, 2), (7, 1), (8, 0)\}$.

3.134. Determine whether the following relation is a function:
$S := \{(0, 2), (1, 5), (2, 7), (3, 5), (4, 2), (5, 5), (6, 7), (7, 5), (8, 2)\}$.

3.135. Determine whether the following relation is a function:
$Z := \{(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 0)\}$.

3.136. Prove that exactly one function exists from $A := \varnothing$ to $B := \varnothing$.

3.137. Investigate whether a function $F : \varnothing \to B$ exists from $\varnothing$ to any set $B$.

3.138. Investigate whether a function $F : A \to \varnothing$ exists from any set $A$ to $\varnothing$.

3.139. For each set $A$ let $1_A$ denote the constant function 1:

\[ 1_A : A \to \{1\}, \qquad 1_A(X) = 1. \]

For each subset $B \subseteq A$, prove that $1_A|_B$ coincides with $\chi_B$.


3.140. For all subsets $V$, $W$ of each set $A$, investigate whether the characteristic function of the intersection, $\chi_{V \cap W} : A \to 2$, is the intersection of the two characteristic functions $\chi_V : A \to 2$ and $\chi_W : A \to 2$, so that $\chi_{V \cap W} = \chi_V \cap \chi_W$.


3.141. For all subsets $V$, $W$ of each set $A$, investigate whether the characteristic function of the union, $\chi_{V \cup W} : A \to 2$, is the union of the two characteristic functions $\chi_V : A \to 2$ and $\chi_W : A \to 2$, so that $\chi_{V \cup W} = \chi_V \cup \chi_W$.


3.142. Prove that for each function $F : A \to B$, for all subsets $R, S \subseteq A$, and for all subsets $V, W \subseteq B$, the following relations hold.

\[ F^{\circ -1\,\prime\prime}(V \cup W) = F^{\circ -1\,\prime\prime}(V) \cup F^{\circ -1\,\prime\prime}(W), \]
\[ F^{\circ -1\,\prime\prime}(V \cap W) = F^{\circ -1\,\prime\prime}(V) \cap F^{\circ -1\,\prime\prime}(W), \]
\[ F''(R \cup S) = F''(R) \cup F''(S), \]
\[ F''(R \cap S) \subseteq F''(R) \cap F''(S). \]


3.143. Provide an example for which $F''(R \cap S) \subsetneq F''(R) \cap F''(S)$.


3.144. For each function $F : A \to B$ and for all subsets $V, W \subseteq B$, prove that $F^{\circ -1\,\prime\prime}(W \setminus V) = [F^{\circ -1\,\prime\prime}(W)] \setminus [F^{\circ -1\,\prime\prime}(V)]$.


3.145. For each function $F : A \to B$ and for all subsets $H, K \subseteq A$, investigate whether inclusion or equality holds for $F''(K \setminus H)$ and $[F''(K)] \setminus [F''(H)]$.


3.146. For each function $F : A \to B$, prove that $F''(V) = \mathcal{R}(F|_V)$.


3.147. For each function $F : A \to B$, prove that $F'' : \mathcal{P}(A) \to \mathcal{P}(B)$ contains all the information about $F$, in the sense that $\{F(X)\} = F''(\{X\})$ for every $X \in A$.


3.148. For each function $\mathcal{F} : \mathcal{P}(A) \to \mathcal{P}(B)$, investigate whether there exists a function $F : A \to B$ such that $F'' = \mathcal{F}$.


3.149. Consider the function $F : A \to B$ defined by

\[ A := 3 = \{0, 1, 2\} = \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}, \]
\[ B := A, \]
\[ F := \{(0, 2), (1, 2), (2, 0)\}. \]

Recall that the superscript $''$ indicates images of subsets (rather than of elements).

(1) $\{\varnothing, \{\varnothing\}\}$ is a subset of $A$. Find its image: $F''(\{\varnothing, \{\varnothing\}\})$.
(2) $\{\varnothing, \{\varnothing\}\}$ is a subset of $B$. Find its inverse image $F^{\circ -1\,\prime\prime}(\{\varnothing, \{\varnothing\}\})$.
(3) $\{\varnothing, \{\varnothing\}\}$ is an element of $A$. Find its image $F(\{\varnothing, \{\varnothing\}\})$.
(4) $\{\varnothing, \{\varnothing\}\}$ is an element of $B$. Find $F^{\circ -1\,\prime\prime}(\{\{\varnothing, \{\varnothing\}\}\})$.


3.150. Consider the function $G : C \to D$ defined by

\[ C := 4 = \{0, 1, 2, 3\} = \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}, \{\varnothing, \{\varnothing\}, \{\varnothing, \{\varnothing\}\}\}\}, \]
\[ D := C, \]
\[ G := \{(0, 3), (1, 0), (2, 3), (3, 1)\}. \]

Recall that the superscript $''$ indicates images of subsets (rather than of elements).

(1) 3 is a subset of $C$. Find its image: $G''(3)$.
(2) 3 is a subset of $D$. Find its inverse image $G^{\circ -1\,\prime\prime}(3)$.
(3) 3 is an element of $C$. Find its image $G(3)$.
(4) 3 is an element of $D$. Find $G^{\circ -1\,\prime\prime}(\{3\})$.


3.7 Composite and Inverse Functions 
3.7.1 Compositions of Functions 


Some situations involve sequences of operations corresponding to sequences of functions. For instance, if a first function consists of pairs $(T, X)$ with the ascension (elevation) $X$ of a planet at time $T$, and if a second function consists of pairs $(X, Y)$ with the declination (azimuth) $Y$ of the planet at ascension $X$, then the composition of the two functions consists of pairs $(T, Y)$ with the declination $Y$ of the planet at time $T$.


3.105 Definition (Composition of functions). For all functions $F : A \to B$ and $G : B \to C$, the composite function $G \circ F$ (read “$G$ preceded by $F$” or “$F$ followed by $G$” or “the composition of $G$ and $F$”) is the function $G \circ F : A \to C$ defined by

\[ (G \circ F)(X) := G[F(X)] \]

for each $X \in F^{\circ -1\,\prime\prime}[\mathcal{D}(G)]$. Thus,

\[ [(X, Z) \in (G \circ F)] \Leftrightarrow \{\exists Y([Y \in B] \land [(X, Y) \in F] \land [(Y, Z) \in G])\}. \]

3.106 Example. Consider the following functions $F$ and $G$:

\[ A = \{0, 1\}, \qquad B = \{0, 1, 2\}, \qquad C = \{0, 1\}, \]
\[ F : A \to B, \qquad F = \{(0, 0), (1, 2)\}; \]
\[ G : B \to C, \qquad G = \{(0, 1), (1, 0), (2, 1)\}. \]

Their composition

\[ (G \circ F) : A \to C \]

has values

\[ (G \circ F)(0) = G[F(0)] = G[0] = 1, \]
\[ (G \circ F)(1) = G[F(1)] = G[2] = 1, \]

so that

\[ (G \circ F) = \{(0, 1), (1, 1)\}. \]
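
The computation in example 3.106 can be reproduced with functions stored as Python dictionaries. The helper name `compose` is an assumption of this sketch; it keeps only those arguments $X$ whose value $F(X)$ lies in the domain of $G$, in line with definition 3.105.

```python
def compose(g, f):
    """Return G o F, defined at each x for which f[x] lies in the domain of g."""
    return {x: g[f[x]] for x in f if f[x] in g}

# The functions F and G of example 3.106.
F = {0: 0, 1: 2}
G = {0: 1, 1: 0, 2: 1}

print(compose(G, F))    # {0: 1, 1: 1}
```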


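3.107 Theorem. The composition of functions is associative: For all functions $F : A \to B$, $G : B \to C$, and $H : C \to D$,

\[ [H \circ (G \circ F)] = [(H \circ G) \circ F]. \]

Proof. For each $X \in F^{\circ -1\,\prime\prime}\{G^{\circ -1\,\prime\prime}[\mathcal{D}(H)]\}$, apply the definition of $\circ$ repeatedly:

\[ [H \circ (G \circ F)](X) = H\{(G \circ F)(X)\} = H\{G[F(X)]\} = [H \circ G][F(X)] = ([H \circ G] \circ F)(X). \]
□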


In contrast to its associativity, the composition of functions is not commutative.


3.108 Counterexample. Consider the following functions $F$ and $G$:

\[ A = \{0, 1\}, \qquad B = \{0, 1, 2\}, \qquad C = \{0, 1\}, \]
\[ F : A \to B, \qquad F = \{(0, 0), (1, 2)\}; \]
\[ G : B \to C, \qquad G = \{(0, 1), (1, 0), (2, 1)\}. \]

Because $C = A$, the composition $F \circ G$ is defined on all of $B$:

\[ (F \circ G) : B \to B \]

has values

\[ (F \circ G)(0) = F[G(0)] = F[1] = 2, \]
\[ (F \circ G)(1) = F[G(1)] = F[0] = 0, \]
\[ (F \circ G)(2) = F[G(2)] = F[1] = 2, \]

so that

\[ (F \circ G) = \{(0, 2), (1, 0), (2, 2)\}. \]

In contrast,

\[ (G \circ F) = \{(0, 1), (1, 1)\} \]

from example 3.106, which confirms that $(F \circ G) \neq (G \circ F)$.


3.7.2 Injective, Surjective, Bijective, and Inverse Functions 


Such problems as the solution of equations involve the determination of whether 
an equation has no solution, exactly one solution, or more than one solution, which 
correspond to the features of functions introduced here. 


3.109 Definition (injectivity). A function $F : A \to B$ is injective if and only if for all $W \in \mathcal{D}(F)$ and $X \in \mathcal{D}(F)$, if $W \neq X$, then $F(W) \neq F(X)$:

\[ \forall W[\forall X(\{[W \in \mathcal{D}(F)] \land [X \in \mathcal{D}(F)]\} \Rightarrow \{(W \neq X) \Rightarrow [F(X) \neq F(W)]\})]. \]

By contraposition, the condition just stated is equivalent to the following alternative condition: for all $W \in \mathcal{D}(F)$ and $X \in \mathcal{D}(F)$, if $F(W) = F(X)$, then $W = X$:

\[ \forall W[\forall X(\{[W \in \mathcal{D}(F)] \land [X \in \mathcal{D}(F)]\} \Rightarrow \{[F(X) = F(W)] \Rightarrow (W = X)\})]. \]

The notation $F : A \hookrightarrow B$ indicates that $F$ is an injection.


Another common usage consists of saying that $F$ maps only one $X$ to each one $Y$; yet this alternative terminology fails to indicate which “one” it emphasizes (the first “one”), which leads to confusion, and, therefore, will not be used here.


3.110 Example. With $A := \{0, 1, 2\}$ and $B := \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$, the function $G := \{(0, 1), (1, 3), (2, 9)\}$ is injective.


3.111 Example. With $A := \{4, 6, 8, 9\}$ and $B := \{2, 3\}$, the function

\[ H := \{(4, 2), (6, 3), (8, 2), (9, 3)\} \]

is not injective, because $4 \neq 8$ but $H(4) = 2 = H(8)$.

3.112 Definition (surjectivity). A function $F : A \to B$ is surjective if and only if for each $Y \in B$, there exists some $X \in A$ for which $Y = F(X)$. In other words, the condition just stated means that the range of $F$ consists of all of the co-domain $B$:

\[ \forall Y\{(Y \in B) \Rightarrow (\exists X\{(X \in A) \land [F(X) = Y]\})\}. \]

The notation $F : A \twoheadrightarrow B$ indicates that $F$ is a surjection.
Another common mathematical usage consists of saying that $F$ maps $A$ onto $B$.


3.113 Example. With $A := \{0, 1, 2\}$ and $B := \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$, the function $G := \{(0, 1), (1, 3), (2, 9)\}$ is not surjective: there is no $X \in A$ with $G(X) = 6$.


3.114 Example. With $A := \{4, 6, 8, 9\}$ and $B := \{2, 3\}$, the function

\[ H := \{(4, 2), (6, 3), (8, 2), (9, 3)\} \]

is surjective. Indeed, $H(4) = 2$ for $Y := 2$, and $H(6) = 3$ for $Y := 3$.

3.115 Example. With $A := \{6, 7, 8, 9\}$ and $B := \{1, 2, 3, 4\}$, the function

\[ F := \{(6, 2), (7, 1), (8, 2), (9, 3)\} \]

is neither injective, because $F(6) = 2 = F(8)$ with $6 \neq 8$, nor surjective, because there does not exist any $X \in A$ with $F(X) = 4$.
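
Examples 3.110 through 3.115 can be verified mechanically. In the following Python sketch, functions are stored as dictionaries and the co-domain is passed separately, because surjectivity depends on it; the helper names `is_injective` and `is_surjective` are assumptions of this sketch.

```python
def is_injective(function):
    """Return True if distinct arguments always have distinct values."""
    values = list(function.values())
    return len(values) == len(set(values))

def is_surjective(function, codomain):
    """Return True if every element of the co-domain is a value of the function."""
    return set(function.values()) == set(codomain)

G = {0: 1, 1: 3, 2: 9}             # examples 3.110 and 3.113
H = {4: 2, 6: 3, 8: 2, 9: 3}       # examples 3.111 and 3.114
F = {6: 2, 7: 1, 8: 2, 9: 3}       # example 3.115

print(is_injective(G), is_surjective(G, range(10)))       # True False
print(is_injective(H), is_surjective(H, {2, 3}))          # False True
print(is_injective(F), is_surjective(F, {1, 2, 3, 4}))    # False False
```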


3.116 Definition. For all sets $A$, $B$, $C$, $D$, and for all functions $F : A \to B$ and $G : C \to D$, define the function

\[ F \boxtimes G : A \times C \to B \times D, \qquad (W, Z) \mapsto (F(W), G(Z)). \]

Thus $((W, Z), (F(W), G(Z))) \in F \boxtimes G$ if and only if $(W, F(W)) \in F$ and $(Z, G(Z)) \in G$, so that $((W, F(W)), (Z, G(Z))) \in F \times G$.

3.117 Theorem. If $F : A \to B$ and $G : C \to D$ are both injective, or both surjective, then $F \boxtimes G$ is injective, or surjective, respectively.


Proof. If $F : A \to B$ and $G : C \to D$ are both surjective, then the function $F \boxtimes G$ is surjective: indeed, for each $(U, V) \in B \times D$, or, equivalently, for all $U \in B$ and $V \in D$, there exist $X \in A$ and $Y \in C$ such that $F(X) = U$ and $G(Y) = V$, by surjectivity of $F$ and $G$. Hence

\[ (F \boxtimes G)(X, Y) = (F(X), G(Y)) = (U, V). \]

If $F : A \to B$ and $G : C \to D$ are both injective, then the function $F \boxtimes G$ is injective: indeed, if $(F \boxtimes G)(W, Z) = (F \boxtimes G)(R, S)$, then

\[ (F(W), G(Z)) = (F \boxtimes G)(W, Z) = (F \boxtimes G)(R, S) = (F(R), G(S)). \]

Hence $F(W) = F(R)$ and $G(Z) = G(S)$, by equality of ordered pairs. Consequently, $W = R$ and $Z = S$, by injectivity of $F$ and $G$. □
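
The function of definition 3.116, which acts on pairs coordinate by coordinate, can be built explicitly for finite functions. In this Python sketch the helper names `product_function` and `is_injective` and the sample functions `F`, `G`, and `H` are hypothetical; the sketch illustrates the injective case of theorem 3.117 on one example and does not replace the proof.

```python
def product_function(f, g):
    """Return the function mapping each pair (w, z) to the pair (f[w], g[z])."""
    return {(w, z): (f[w], g[z]) for w in f for z in g}

def is_injective(function):
    """Return True if distinct arguments always have distinct values."""
    values = list(function.values())
    return len(values) == len(set(values))

# Two hypothetical injective functions.
F = {0: 1, 1: 3}
G = {4: 2, 6: 3}
FG = product_function(F, G)
print(FG[(1, 4)])            # (3, 2)
print(is_injective(FG))      # True

# With a non-injective second factor, the product is not injective either.
H = {4: 2, 8: 2}
print(is_injective(product_function(F, H)))   # False
```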


3.118 Definition (Bijectivity). A function F : A > B is bijective if and only if F 
is both injective and surjective, which can be denoted by F: AZ BorF: A= B. 


3.119 Example. With $A := \{0, 1, 2, 3\}$ and $B := \{1, 2, 4, 8\}$, the function $P : A \to B$ defined by $P := \{(0, 1), (1, 2), (2, 4), (3, 8)\}$ is bijective.


3.120 Example. For each set $A$ the identity function $I_A : A \to A$, $X \mapsto X$ is bijective. Indeed, $I_A$ is injective, because if $I_A(W) = I_A(X)$ then $W = I_A(W) = I_A(X) = X$. Similarly, $I_A$ is surjective, because for each $Y \in A$ there exists $X \in A$, in effect $X := Y$, with $Y = I_A(Y)$.


3.121 Example. For each proper subset $S \subsetneq A$, the inclusion function $\iota : S \to A$, $X \mapsto X$, is injective but not surjective. Indeed, $\iota$ is injective, for if $W, X \in S$ and $W \neq X$, then $\iota(W) = W \neq X = \iota(X)$, whence $\iota(W) \neq \iota(X)$. However, $\iota$ is not surjective: because $S$ is a proper subset of $A$, there exists some element $Z$ in $A \setminus S$; in particular, $Z \neq X$ for each $X \in S$, and, consequently, $\iota(X) = X \neq Z$, which means that $\iota$ is not surjective. Thus, $\iota$ is not bijective either.


3.122 Example. For each nonempty set $A$ and for each set $B$ containing more than one element, the canonical projection $P_A : (A \times B) \to A$ is surjective but not injective. Indeed, $B$ contains at least one element $Y \in B$, because $B \neq \varnothing$; hence, $X = P_A(X, Y)$ for each $X \in A$. However, $B$ also contains some $Z \in B$ such that $Y \neq Z$. Consequently, $(X, Y) \neq (X, Z)$, and yet $P_A(X, Y) = X = P_A(X, Z)$, which means that $P_A$ is not injective. Thus, $P_A$ is not bijective either.


3.123 Theorem. For all functions F : A → B and G : B → C,

• if F and G are both injective, then G ∘ F is also injective;
• if F and G are both surjective, then G ∘ F is also surjective;
• if F and G are both bijective, then G ∘ F is also bijective;
• if G ∘ F is injective, then F is injective;
• if G ∘ F is surjective, then G is surjective;
• if G ∘ F is bijective, then F is injective and G is surjective.


Proof. Assume that F and G are both injective. For all distinct elements W ≠ X in
F^{∘−1}[𝒟(G)], the injectivity of F ensures that F(W) ≠ F(X). Hence G(F(W)) ≠
G(F(X)) by the injectivity of G. Hence, (G ∘ F)(W) = G(F(W)) ≠ G(F(X)) =
(G ∘ F)(X), whence (G ∘ F)(W) ≠ (G ∘ F)(X), so that G ∘ F is injective.

Assume that F and G are both surjective. For each Z ∈ C, the surjectivity of
G ensures the existence of an element Y ∈ B such that G(Y) = Z. Hence, by the
surjectivity of F, there exists an element X ∈ A for which F(X) = Y. Therefore,
(G ∘ F)(X) = G(F(X)) = G(Y) = Z, which means that G ∘ F is surjective.

In particular, if F and G are both bijective, then G ∘ F is also bijective.

Assume that G ∘ F is injective. For all distinct elements W ≠ X in F^{∘−1}[𝒟(G)],
the injectivity of G ∘ F ensures that (G ∘ F)(W) ≠ (G ∘ F)(X), whence G(F(W)) ≠
G(F(X)). Because G is a function, G cannot take different values at the same
argument, whence it follows that F(W) ≠ F(X), so that F is injective.

Assume that G ∘ F is surjective. For each element Z ∈ C, the surjectivity of G ∘ F
ensures the existence of an element X ∈ A such that (G ∘ F)(X) = Z. Hence, letting
Y := F(X) demonstrates the existence of an element Y ∈ B for which G(Y) =
G(F(X)) = (G ∘ F)(X) = Z, which means that G is surjective.

In particular, if G ∘ F is bijective, then F is injective and G is surjective. □
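
The last three items of theorem 3.123 cannot be strengthened: a bijective composite does not force both factors to be bijective. The following Python sketch checks one small instance; the sets, the two functions, and the helper names are hypothetical choices made for this illustration.

```python
def compose(g, f):
    """G after F as a set of pairs: (x, z) with (x, y) in F and (y, z) in G."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}

def is_injective(f):
    return len({y for _, y in f}) == len(f)

def is_surjective(f, codomain):
    return {y for _, y in f} == set(codomain)

A, B, C = {0, 1}, {0, 1, 2}, {0, 1}
F = {(0, 0), (1, 1)}            # F : A -> B, injective but not surjective
G = {(0, 0), (1, 1), (2, 1)}    # G : B -> C, surjective but not injective
GF = compose(G, F)              # G o F : A -> C

print(sorted(GF))
print(is_injective(GF), is_surjective(GF, C))
print(is_surjective(F, B), is_injective(G))
```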


3.124 Definition (invertibility). A function F : A → B is invertible if and only if
there exists a function G : B → A for which G ∘ F = I_{𝒟(F)} and F ∘ G = I_{𝒟(G)}. Such
a function G is denoted by F^{∘−1} and called the inverse function of F. Thus,


F^{∘−1} ∘ F = I_{𝒟(F)};
F ∘ F^{∘−1} = I_{𝒟(G)}.


3.125 Example. With A := {0, 1, 2, 3} and B := {1, 2, 4, 8}, the function

F := {(0, 1), (1, 2), (2, 4), (3, 8)}

is invertible, with inverse F^{∘−1} := {(1, 0), (2, 1), (4, 2), (8, 3)}.
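
As a check of example 3.125, the following Python sketch swaps the coordinates of each pair and composes the result with the original function. The helper names (swap, compose, identity, domain) are assumptions of this sketch.

```python
def compose(g, f):
    """G after F as a set of pairs: (x, z) with (x, y) in F and (y, z) in G."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}

def identity(s):
    """Identity function on the set s, as a set of pairs."""
    return {(x, x) for x in s}

def domain(f):
    return {x for x, _ in f}

def swap(f):
    """Swap both coordinates in each pair of f."""
    return {(y, x) for (x, y) in f}

F = {(0, 1), (1, 2), (2, 4), (3, 8)}
F_inv = swap(F)

print(sorted(F_inv))
print(compose(F_inv, F) == identity(domain(F)))
print(compose(F, F_inv) == identity(domain(F_inv)))
```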


3.126 Theorem. Each function F : A → B admits at most one inverse function.
Moreover, if G : B → A is an inverse function for F, then G consists of all pairs
obtained by swapping the coordinates in each pair of F.


Proof. Assume that G is an inverse function for F, which means that G ∘ F = I_{𝒟(F)}
and F ∘ G = I_{𝒟(G)}, and consider the set


H := {(Y, X) : (X, Y) ∈ F}.


If (X, Y) ∈ F, then X ∈ 𝒟(F). Because G ∘ F = I_{𝒟(F)}, it follows that there exists
some Z ∈ B such that (X, Z) ∈ F and (Z, X) ∈ G. With (X, Y) ∈ F and (X, Z) ∈ F,
it follows that Y = Z, because F is a function. This shows that if (Y, X) ∈ H, so that
(X, Y) ∈ F, then (Y, X) = (Z, X) ∈ G, whence H ⊆ G.

Conversely, if (Z, X) ∈ G, then Z ∈ 𝒟(G). Because F ∘ G = I_{𝒟(G)}, it follows that
there exists some W ∈ 𝒟(F) such that (Z, W) ∈ G and (W, Z) ∈ F. With (Z, X) ∈ G
and (Z, W) ∈ G, it follows that W = X, because G is a function. This shows that if
(Z, X) ∈ G, then (X, Z) ∈ F, whence (Z, X) ∈ H, so that G ⊆ H.

Finally, G = H, which shows that if F has an inverse function G, then the only
possibility is G = H. Thus, G = H = F^{∘−1} = {(Y, X) : (X, Y) ∈ F}. □


3.127 Theorem. For each function F : A → B with 𝒟(F) = A, the function F is
invertible if and only if F is bijective.


Proof. Assume that F is invertible, with inverse G := F^{∘−1}. Because F ∘ F^{∘−1} =
I_{𝒟(G)} is surjective, it follows from theorem 3.123 that F is surjective. Similarly,
because F^{∘−1} ∘ F = I_{𝒟(F)} is injective, it follows from theorem 3.123 that F is
injective.

Conversely, assume that F is bijective. Construct an inverse function by means
of the set G defined by swapping both coordinates in each pair of the function F:


G := {(Y, X) : (X, Y) ∈ F}.


Then verify that G is the inverse function of F.

First, the injectivity of F ensures that G is a function: if (Y, X) ∈ G and (Y, W) ∈
G, then (X, Y) ∈ F and (W, Y) ∈ F, whence X = W by injectivity.

Second, G ∘ F = I_{𝒟(F)}. Indeed, if (X, Z) ∈ (G ∘ F), then there exists some Y ∈ B
for which (X, Y) ∈ F and (Y, Z) ∈ G. Consequently, (Z, Y) ∈ F, and again the
injectivity of F shows that X = Z, whence (X, Z) = (X, X). Thus, (G ∘ F) ⊆ I_{𝒟(F)}.
Because the foregoing reasoning holds for each X ∈ 𝒟(F), however, it also follows
that I_{𝒟(F)} ⊆ (G ∘ F), and thus (G ∘ F) = I_{𝒟(F)}.

Finally, F ∘ G = I_{𝒟(G)}. Indeed, if (Y, W) ∈ (F ∘ G), then there exists some X ∈ A
with (Y, X) ∈ G and (X, W) ∈ F. From (Y, X) ∈ G, it follows that (X, Y) ∈ F.
Because F is a function, it also follows that Y = W, whence (Y, W) = (Y, Y). Thus,
(F ∘ G) ⊆ I_{𝒟(G)}. Then the surjectivity of F guarantees that for each Y ∈ B there
exists some X ∈ A with (X, Y) ∈ F. Hence, (Y, X) ∈ G and then (Y, Y) ∈ (F ∘ G),
so that I_B ⊆ (F ∘ G). Therefore, (F ∘ G) = I_B. □


Some situations involve a concept more general than invertibility. 


3.128 Definition (left or right invertibility). A function F : A → B is invertible
on its left, or left invertible, if and only if there exists a function G : B → A for
which G ∘ F = I_{𝒟(F)}. Such a function G is called a left inverse function for F.

Similarly, a function F : A → B is invertible on its right, or right invertible,
if and only if there exists a function G : B → A for which F ∘ G = I_{𝒟(G)}. Such a
function G is called a right inverse function for F.


3.129 Example. With A := {0, 1, 2} and B := {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}, the
function F : A → B defined by F := {(0, 1), (1, 3), (2, 9)} has a left inverse function


G := {(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 2)}.


3.130 Example. For A := {4, 6, 8, 9} and B := {2, 3}, the function F : A → B with
F := {(4, 2), (6, 3), (8, 2), (9, 3)} has a right inverse G := {(2, 4), (3, 6)}.


3.131 Example. Perspectives to draw a picture of space A on a flat screen B ⊆ A
can be represented by a function F : A → B, mapping each point X in space A to its
image F(X) on the screen B. For such perspectives, each point Y ∈ B on the screen
B is its own image, so that F(Y) = Y. Thus the inclusion function ι_B : B → A is a
right inverse for F, because (F ∘ ι_B)(Y) = F[ι_B(Y)] = F(Y) = Y for every Y ∈ B,
so that F ∘ ι_B = I_B. In contrast, such a perspective F has no left inverse, because F
maps many points in space to the same image on the screen.


The existence of a left-inverse G : B → A for a function F : A → B indicates
that for each Y ∈ B the equation F(X) = Y has at most one solution. In contrast, the
existence of a right-inverse G : B → A for a function F : A → B indicates that for
each Y ∈ B the equation F(X) = Y has at least one solution.
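
The data of examples 3.129 and 3.130 can be verified directly from definition 3.128. In the following Python sketch, the helper names and the second right inverse G2_alt are assumptions added for illustration; G2_alt also shows that a right inverse need not be unique.

```python
def compose(g, f):
    """G after F as a set of pairs: (x, z) with (x, y) in F and (y, z) in G."""
    return {(x, z) for (x, y) in f for (y2, z) in g if y == y2}

def identity(s):
    return {(x, x) for x in s}

def domain(f):
    return {x for x, _ in f}

def is_left_inverse(g, f):
    """True if G o F is the identity on the domain of F."""
    return compose(g, f) == identity(domain(f))

def is_right_inverse(g, f):
    """True if F o G is the identity on the domain of G."""
    return compose(f, g) == identity(domain(g))

# Example 3.129
F1 = {(0, 1), (1, 3), (2, 9)}
G1 = {(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0), (9, 2)}

# Example 3.130, plus a hypothetical second right inverse
F2 = {(4, 2), (6, 3), (8, 2), (9, 3)}
G2 = {(2, 4), (3, 6)}
G2_alt = {(2, 8), (3, 9)}

print(is_left_inverse(G1, F1), is_right_inverse(G1, F1))
print(is_right_inverse(G2, F2), is_left_inverse(G2, F2))
print(is_right_inverse(G2_alt, F2))
```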


3.7.3 The Set of all Functions from a Set to a Set 


This subsection shows that all the functions from any fixed set into any fixed
codomain form a set. Theorem 3.132 shows that all the functions between two fixed
sets form a set.


3.132 Theorem. For all sets A and B, all the functions from any subset of A to B are 
the elements of a set, of which all such functions defined on all of A form a subset. 


Proof. By definition 3.66, the power set 𝒫(A × B) is the set of all the relations
between A and B. By definition 3.81, every function F : A → B defined on any
subset of A is also a relation, so that F ∈ 𝒫(A × B). By the axiom of separation
(page 124), with a formula stating that a relation F ∈ 𝒫(A × B) is a function, all
such functions form a subset of 𝒫(A × B), denoted here by ℱ_{A→B}:


ℱ_{A→B} := {F ∈ 𝒫(A × B) : ∀W ∀X ∀Y [([(X, Y) ∈ F] ∧ [(X, W) ∈ F]) ⇒ (Y = W)]}.


For each subset D ⊆ A, again by the axiom of separation (page 124), with a formula
stating that the domain of F is D, the functions F : D → B defined on all of D form
a subset of ℱ_{D→B}, denoted here by B^D:


B^D := {F ∈ ℱ_{D→B} : ∀X[(X ∈ D) ⇒ (∃Y[(X, Y) ∈ F])]}.


The case D := A corresponds to the set B^A of all functions from all of A to B. □

Definition 3.133 sets the notation from the proof of theorem 3.132.


3.133 Definition. For all sets A and B, the set of all functions F : A → B defined
on any subset of A is denoted by ℱ_{A→B}. The set of all functions F : A → B defined
on all of A is denoted by B^A.


3.134 Example. If A = ∅, then A × B = ∅ × B = ∅ for every set B. The only subset
F := ∅ ⊆ ∅ = A × B is a function from A to B with domain A.
Hence if K = ∅ = 0, then B^K = B^0 = {∅} = {0}. Thus there exists a function of
zero variables with zero value in every set B, and this function is the natural number
zero. This function of zero variables allows for a minimal set of starting functions in
some contexts, for instance, with primitive recursive functions [109, p. 926].
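
For finite sets, the set B^A can be listed explicitly. The following Python sketch is one possible enumeration, with each function stored as a frozenset of pairs; the name all_functions is an assumption of this sketch, and the elements of A and B are assumed to be sortable. It reproduces the case A = ∅ of example 3.134, where the only function is the empty set.

```python
from itertools import product

def all_functions(A, B):
    """All functions defined on all of A with values in B, as frozensets of pairs."""
    args = sorted(A)
    values = sorted(B)
    return {frozenset(zip(args, choice))
            for choice in product(values, repeat=len(args))}

# Two arguments, three possible values each: 3 ** 2 functions.
print(len(all_functions({0, 1}, {0, 1, 2})))

# Example 3.134: from the empty set, the only function is the empty set.
print(all_functions(set(), {5, 6}) == {frozenset()})
```

With a nonempty A and an empty B the enumeration returns the empty set, because no function defined on all of A can take values in ∅.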


3.135 Theorem. For all sets A and B, ⋃_{D⊆A} B^D = ℱ_{A→B}.


Proof. For each subset D ⊆ A, the set B^D consists of all the functions F : D → B
defined on all of D ⊆ A, whence B^D ⊆ ℱ_{A→B}, or, equivalently, B^D ∈ 𝒫(ℱ_{A→B}),
and hence {B^D ∈ 𝒫(ℱ_{A→B}) : D ∈ 𝒫(A)} is also a set. Consequently so is its union
by the axiom of union (page 127): ⋃_{D⊆A} B^D ⊆ ℱ_{A→B}. Conversely, each function
F ∈ ℱ_{A→B} is defined on all of its domain, which is a subset D ⊆ A, so that F ∈ B^D.
Hence also ℱ_{A→B} ⊆ ⋃_{D⊆A} B^D. □


3.136 Theorem. For all sets A and B, for each set 𝒟 ⊆ 𝒫(A) of subsets of A, all
the sets B^D for every D ∈ 𝒟 also form a set ℱ_{𝒟,B} ⊆ 𝒫(ℱ_{A→B}). Hence the union
⋃ ℱ_{𝒟,B} = ⋃_{D∈𝒟} B^D is also a set.


Proof. For each set 𝒟 ⊆ 𝒫(A) of subsets of A, all the sets B^D for every D ∈ 𝒟
also form a subset of 𝒫(ℱ_{A→B}), denoted here by ℱ_{𝒟,B}:


ℱ_{𝒟,B} := {E ∈ 𝒫[𝒫(A × B)] : ∃D[(D ∈ 𝒟) ∧ (E = B^D)]},


where E = B^D is an abbreviation of the formula resulting from the axiom of
extensionality (page 111) and the preceding two applications of the axiom of
separation (page 124). Hence the union ⋃ ℱ_{𝒟,B} = ⋃_{D∈𝒟} B^D is also a set,


⋃ ℱ_{𝒟,B} = {F ∈ ℱ_{A→B} : ∃D[(D ∈ 𝒟) ∧ (F ∈ B^D)]},


by the axiom of union (page 127). □


Theorem 3.137 reveals a bijection between the set C^{A×B} of all functions defined
on a Cartesian product A × B into a set C and the set (C^B)^A of all functions from A
into the set C^B of all functions defined on B into C.


3.137 Theorem. For all sets A, B, C, there is a bijection from C^{A×B} onto (C^B)^A.


Proof. Define a function H : C^{A×B} → (C^B)^A as follows. For each element F ∈
C^{A×B}, which is a function F : A × B → C defined on all of A × B, define an element
H(F) ∈ (C^B)^A, which is a function H(F) : A → C^B defined on all of A, so that for
each X ∈ A the image [H(F)](X) ∈ C^B is a function [H(F)](X) : B → C defined on
all of B, defined for each Y ∈ B by


H : C^{A×B} → (C^B)^A,
F ↦ H(F),
{[H(F)](X)}(Y) := F(X, Y).


Similarly, define a function L : (C^B)^A → C^{A×B} as follows. For each element G ∈
(C^B)^A, define an element L(G) ∈ C^{A×B}, which is a function L(G) : A × B → C,
defined by [L(G)](X, Y) := [G(X)](Y):


L : (C^B)^A → C^{A×B},
G ↦ L(G),
[L(G)](X, Y) := [G(X)](Y).


Then H ∘ L = I_{(C^B)^A} and L ∘ H = I_{C^{A×B}}, whence H and L are inverses of each other:


[(L ∘ H)(F)](X, Y) = {L[H(F)]}(X, Y)
                   = {[H(F)](X)}(Y)
                   = F(X, Y);
{[(H ∘ L)(G)](X)}(Y) = {[H(L(G))](X)}(Y)
                     = [L(G)](X, Y)
                     = [G(X)](Y).


Thus (L ∘ H)(F) = F and (H ∘ L)(G) = G. Hence H and L are bijections. □
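
In programming, the maps H and L of theorem 3.137 are commonly known as currying and uncurrying. The following Python sketch models them with callables instead of sets of pairs, which is an assumption of this illustration; the sample functions F and G are hypothetical.

```python
def H(F):
    """Send a two-argument function F to the function X -> (Y -> F(X, Y))."""
    return lambda x: (lambda y: F(x, y))

def L(G):
    """Send a function G with function values to (X, Y) -> [G(X)](Y)."""
    return lambda x, y: G(x)(y)

def F(x, y):
    # Hypothetical element of C^(A x B).
    return 10 * x + y

def G(x):
    # Hypothetical element of (C^B)^A.
    return lambda y: x - y

points = [(x, y) for x in range(3) for y in range(3)]
print(all(L(H(F))(x, y) == F(x, y) for x, y in points))
print(all(H(L(G))(x)(y) == G(x)(y) for x, y in points))
```

The two checks correspond to (L ∘ H)(F) = F and (H ∘ L)(G) = G, tested here only on a finite sample of arguments.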


3.7.4 Exercises on Injective, Surjective, and Inverse Functions 


3.151. For each F : A → B and I_B : B → B, prove that I_B ∘ F = F.
3.152. For each F : A → B and I_A : A → A, prove that F ∘ I_A = F.
3.153. For each F : A → B and ∅ : ∅ → A, prove that F ∘ ∅ = ∅.
3.154. Prove that if a function has a left inverse, then it is injective.
3.155. Prove that if a function is injective, then it has a left inverse.
3.156. Prove that if a function has a right inverse, then it is surjective.
3.157. Provide an example of a function with a right inverse but no left inverse.
3.158. Provide a function with a left inverse but no right inverse.
3.159. Provide a function that has more than one left inverse.
3.160. Provide a function that has more than one right inverse.


3.8 Equivalence Relations 


3.8.1 Reflexive, Symmetric, Transitive, or Anti-Symmetric Relations


Besides functions, mathematics contains several other types of relations. For 
instance, ordering relations define orders or rankings, whereas equivalence relations 
define equivalences relative to certain criteria. Such various types of relations can 
be defined by combinations of several features called reflexivity, symmetry, and 
transitivity. 


3.138 Definition (Reflexivity). For each set A, a relation R ⊆ A × A is reflexive if
and only if (X, X) ∈ R for each X ∈ A:

∀X{(X ∈ A) ⇒ [(X, X) ∈ R]}.


As a graph, a reflexive relation contains at least a single loop at each vertex:


[Figure: loops (X, X) and (Y, Y) at two vertices.]


3.139 Theorem. A relation ℛ ⊆ A × A is reflexive if and only if Δ_A ⊆ ℛ.


Proof. A relation ℛ ⊆ A × A is reflexive if and only if (X, X) ∈ ℛ for every X ∈ A,
in other words, if and only if ℛ contains the set {(X, X) : X ∈ A} = Δ_A. □


3.140 Example. For each set A, the diagonal Δ_A is a reflexive relation.


3.141 Example. For each set A and its power set 𝒫(A), the relation ⊆ on 𝒫(A) is
reflexive. Indeed, B ⊆ B for each B ∈ 𝒫(A).


3.142 Counterexample. The relation of membership ∈ is not reflexive. For
instance, if A = {∅}, then ∅ ∈ A, but ∅ ∉ ∅. Thus ∈ is not reflexive on
A × A.


3.143 Definition (symmetry). For each set A, a relation R ⊆ A × A is symmetric
if and only if (X, Y) ∈ R is equivalent to (Y, X) ∈ R for all X ∈ A and Y ∈ A:


∀X ∀Y([(X, Y) ∈ R] ⇔ [(Y, X) ∈ R]).


As a graph, a symmetric relation contains edges in either both directions or neither:


[Figure: edges (X, Y) and (Y, X) between two vertices.]


3.144 Example. For each set A, the diagonal Δ_A is a symmetric relation.


3.145 Counterexample. The relation of membership ∈ is not symmetric in general:
if A := {∅, {∅}}, then ∅ ∈ {∅} but {∅} ∉ ∅; thus ∈ is not symmetric.


3.146 Definition (transitivity). For each set A, a relation R ⊆ A × A is transitive
if and only if (X, Y) ∈ R and (Y, Z) ∈ R imply (X, Z) ∈ R for all X, Y, Z ∈ A:


∀X ∀Y ∀Z({[(X, Y) ∈ R] ∧ [(Y, Z) ∈ R]} ⇒ [(X, Z) ∈ R]).


As a graph, a transitive relation completes two consecutive edges into a triangle:


[Figure: graph of a transitive relation, showing the edge (X, Y).]


3.147 Example. For each set A, the diagonal Δ_A is a transitive relation.


3.148 Example. For each set A, the relation ⊆ on 𝒫(A) is transitive: for all U ∈
𝒫(A), V ∈ 𝒫(A), W ∈ 𝒫(A), if U ⊆ V and V ⊆ W, then U ⊆ W.


3.149 Counterexample. The relation of membership ∈ is not transitive in general:
if A = {∅, {∅}, {{∅}}}, then ∅ ∈ {∅} and {∅} ∈ {{∅}}, but ∅ ∉ {{∅}}.
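
For finite sets, reflexivity, symmetry, and transitivity can be tested by brute force. The following Python sketch models the set A of counterexample 3.149 with nested frozensets; the helper names and the constants EMPTY, S1, S2 are assumptions of this sketch.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, z) in R
               for (x, y) in R for (y2, z) in R if y == y2)

EMPTY = frozenset()        # the empty set
S1 = frozenset({EMPTY})    # the singleton containing the empty set
S2 = frozenset({S1})       # the singleton containing S1
A = {EMPTY, S1, S2}

# Membership restricted to A, and the diagonal of A.
membership = {(x, y) for x in A for y in A if x in y}
diagonal = {(x, x) for x in A}

print(is_reflexive(membership, A), is_symmetric(membership),
      is_transitive(membership))
print(is_reflexive(diagonal, A), is_symmetric(diagonal),
      is_transitive(diagonal))
```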


3.8.2 Partitions and Equivalence Relations 


The concept of an equivalence relation on a set corresponds to a “partition” of that 
set into a union of disjoint subsets called “equivalence” classes. 


3.150 Definition (equivalence relations). For each set A, a relation R ⊆ A × A is
an equivalence relation if and only if R is reflexive, symmetric, and transitive.


3.151 Example. For each set A, the diagonal Δ_A is an equivalence relation.


3.152 Example. The set A := {0, 1, 2, 3} admits an equivalence relation


ℛ := {(0, 0), (1, 1), (2, 2), (3, 3), (0, 2), (2, 0), (1, 3), (3, 1)}
   = Δ_A ∪ {(0, 2), (2, 0), (1, 3), (3, 1)}.


• The relation ℛ is reflexive because it contains the diagonal
  Δ_A = {(0, 0), (1, 1), (2, 2), (3, 3)}.
• The relation ℛ is symmetric: it contains (0, 2), (2, 0), as well as (1, 3), (3, 1).
• The relation ℛ is transitive because if it contains (X, Y) and (Y, Z), then it
  contains (X, Z), for instance, (0, 2) and (2, 0) hence (0, 0); (2, 0) and (0, 2) hence
  (2, 2), as well as (1, 3) and (3, 1) hence (1, 1); (3, 1) and (1, 3) hence (3, 3).


3.153 Counterexample. The relation of membership € is not an equivalence 
relation in general, because it is not symmetric and not transitive. 


3.154 Definition (partition). A partition of a set A is a set ℱ ⊆ 𝒫(A) of subsets
of A with all of the following properties.


• No member of ℱ is empty: ∀V{(V ∈ ℱ) ⇒ (V ≠ ∅)}.
• The union of all the members of ℱ "covers" A, which means that A = (⋃ ℱ).
• All pairs of distinct members of ℱ are disjoint:

  ∀V(∀W{[(V ∈ ℱ) ∧ (W ∈ ℱ) ∧ (V ≠ W)] ⇒ [(V ∩ W) = ∅]}).


3.155 Example. The empty set ∅ admits only one partition: the empty set ℱ = ∅.


3.156 Example. Each nonempty set S has a partition: the singleton ℱ = {S}.


3.157 Example. The set A := {0, 1, 2, 3} admits a partition ℱ into two disjoint
sets, ℱ := {{0, 2}, {1, 3}}.


• No member of ℱ is empty: {0, 2} ≠ ∅ and {1, 3} ≠ ∅.
• The union of the members of ℱ covers A, so that {0, 1, 2, 3} = {0, 2} ∪ {1, 3}.
• Distinct members of ℱ are disjoint: {0, 2} ∩ {1, 3} = ∅.


3.158 Definition (relations from partitions). For each partition ℱ ⊆ 𝒫(A) of
each set A, define a relation R_ℱ on A so that R_ℱ relates elements X ∈ A and Y ∈ A if
and only if X and Y belong to the same element of the partition ℱ:


[(X, Y) ∈ R_ℱ] ⇔ {∃B[(B ∈ ℱ) ∧ (X ∈ B) ∧ (Y ∈ B)]}.


3.159 Theorem (equivalence relations from partitions). For each set A and for
each partition ℱ of A, the relation R_ℱ is an equivalence relation.


Proof. The relation R_ℱ is reflexive: for each X ∈ A, the partition ℱ has an element
B ∈ ℱ that contains X, because ℱ covers A. Thus X and X both belong to the same
B ∈ ℱ, whence (X, X) ∈ R_ℱ. Symbolically,

    ∀X{(X ∈ A) ⇒ [(X, X) ∈ R_ℱ]}                        yet unproved,
  ⇕ definition 3.158 for R_ℱ,
    ∀X({X ∈ A} ⇒ {∃B[(B ∈ ℱ) ∧ (X ∈ B) ∧ (X ∈ B)]})
  ⇕ tautology [(P) ∧ (P)] ⇔ (P),
    ∀X({X ∈ A} ⇒ {∃B[(B ∈ ℱ) ∧ (X ∈ B)]})
  ⇕ definition of ⋃,
    A ⊆ (⋃ ℱ)

which is universally valid by definition of a partition.

The relation R_ℱ is symmetric: (X, Y) ∈ R_ℱ if and only if the partition ℱ has an
element B ∈ ℱ that contains X and Y, whence B contains Y and X, which means
that (Y, X) ∈ R_ℱ. Symbolically, from the tautology [(P) ∧ (Q)] ⇔ [(Q) ∧ (P)],

  ⊢ ∀X[∀Y{[(X ∈ A) ∧ (Y ∈ A)] ⇒
        ({∃B[(B ∈ ℱ) ∧ (X ∈ B) ∧ (Y ∈ B)]} ⇔
         {∃B[(B ∈ ℱ) ∧ (Y ∈ B) ∧ (X ∈ B)]})}]
  ⇕ definition 3.158 for R_ℱ.
    ∀X[∀Y([(X ∈ A) ∧ (Y ∈ A)] ⇒
        {[(X, Y) ∈ R_ℱ] ⇔ [(Y, X) ∈ R_ℱ]})]

The relation R_ℱ is transitive: if (X, Y) ∈ R_ℱ, then the partition ℱ has an element
B ∈ ℱ that contains X and Y. If also (Y, Z) ∈ R_ℱ, then the partition ℱ has an
element C ∈ ℱ that contains Y and Z. However, from Y ∈ B and Y ∈ C it follows
that Y ∈ (B ∩ C), whence B and C are not disjoint and hence B = C, by definition
of a partition. Consequently, X ∈ B and Z ∈ B, and, therefore, (X, Z) ∈ R_ℱ. □
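
Definition 3.154 and definition 3.158 translate directly into finite computations. The following Python sketch tests the three properties of a partition and builds the relation from a partition; the function names and the failing example `overlapping` are assumptions made for this illustration.

```python
def is_partition(blocks, A):
    """Check the three properties of definition 3.154 for a finite family."""
    blocks = [frozenset(b) for b in blocks]
    no_empty = all(len(b) > 0 for b in blocks)
    covers = set().union(*blocks) == set(A)
    disjoint = all(b.isdisjoint(c) for b in blocks for c in blocks if b != c)
    return no_empty and covers and disjoint

def relation_from_partition(blocks):
    """Relate x and y exactly when some block contains both (definition 3.158)."""
    return {(x, y) for block in blocks for x in block for y in block}

A = {0, 1, 2, 3}
partition = [{0, 2}, {1, 3}]        # Example 3.157
overlapping = [{0, 1, 2}, {2, 3}]   # hypothetical family that is not a partition

R = relation_from_partition(partition)
print(is_partition(partition, A), is_partition(overlapping, A))
print(sorted(R))
```

For the partition of example 3.157, the resulting relation coincides with the equivalence relation of example 3.152.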


3.160 Definition (equivalence classes). For each equivalence relation R ⊆ (A × A),
and for each element X ∈ A, define the subset [X]_R of all the elements Y ∈ A
equivalent to X with respect to R, called the equivalence class of X:


[X]_R := {Y ∈ A : (X, Y) ∈ R}.


Then let ℱ_R consist of all such equivalence classes:


ℱ_R := {[X]_R ∈ 𝒫(A) : X ∈ A}.


Another common notation for ℱ_R is A/R, so that ℱ_R = A/R = {[X]_R : X ∈ A}.
The set ℱ_R = A/R of all equivalence classes is also called the quotient of the set A
by the relation R.


3.161 Example. For each set A, the equivalence classes of the "diagonal"
equivalence relation Δ_A consist of every singleton [X]_{Δ_A} = {X} for every X ∈ A. Thus
A/Δ_A = {{X} : X ∈ A}.




3.162 Example. For the set A := {0, 1, 2, 3}, the equivalence relation


ℛ := {(0, 0), (1, 1), (2, 2), (3, 3), (0, 2), (2, 0), (1, 3), (3, 1)}
   = Δ_A ∪ {(0, 2), (2, 0), (1, 3), (3, 1)}


corresponds to the equivalence classes


[0]_ℛ = {0, 2},
[1]_ℛ = {1, 3}.


Thus A/ℛ = {[0]_ℛ, [1]_ℛ}.


3.163 Theorem (partitions from equivalence relations). For each equivalence
relation R ⊆ (A × A), the set of subsets ℱ_R ⊆ 𝒫(A) is a partition of A.


Proof. The set ℱ_R covers A: the reflexivity of R guarantees that (X, X) ∈ R
for each X ∈ A, whence X ∈ [X]_R, and hence X ∈ (⋃_{B∈ℱ_R} B) = (⋃ ℱ_R). Thus
A ⊆ (⋃ ℱ_R). The reverse inclusion follows from (⋃ ℱ_R) ⊆ [⋃ 𝒫(A)] = A:

  ⊢ ∀X{(X ∈ A) ⇒ [(X, X) ∈ R]}                          reflexivity of R,
  ⇓ definition of [X]_R,
    ∀X{(X ∈ A) ⇒ (X ∈ [X]_R)}
  ⇓ definition of ℱ_R,
    ∀X{(X ∈ A) ⇒ [([X]_R ∈ ℱ_R) ∧ (X ∈ [X]_R)]}
  ⇓ definition of ∃,
    ∀X[(X ∈ A) ⇒ {∃B[(B ∈ ℱ_R) ∧ (X ∈ B)]}]
  ⇓ definition of ⋃,
    ∀X{(X ∈ A) ⇒ [X ∈ (⋃ ℱ_R)]}
  ⇓ definition of ⊆.
    A ⊆ (⋃ ℱ_R)

Any two distinct elements of ℱ_R are disjoint: if two members B and C of ℱ_R are
not disjoint, then their intersection B ∩ C contains an element X ∈ A; by definition
of ℱ_R, however, every element of B is equivalent to X, and so is every element of C,
whence B = [X]_R = C, which contradicts the distinctness of B and C. Finally,
ℱ_R does not contain any empty element. Indeed, if A = ∅, then R ⊆ A × A = ∅, whence
R = ∅ and ℱ_R = ∅, which does not contain any element, and hence no empty
element. If A ≠ ∅, then ℱ_R = {[X]_R : X ∈ A} where X ∈ [X]_R, whence [X]_R ≠ ∅.
□
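
On a finite set, the equivalence classes and the quotient of definition 3.160 can be computed from the pairs of the relation. The following Python sketch uses the data of example 3.162; the function names are assumptions of this sketch.

```python
def equivalence_class(R, x):
    """All y with (x, y) in R, as a frozenset."""
    return frozenset(y for (w, y) in R if w == x)

def quotient(R, A):
    """The set of all equivalence classes of elements of A."""
    return {equivalence_class(R, x) for x in A}

A = {0, 1, 2, 3}
R = {(0, 0), (1, 1), (2, 2), (3, 3), (0, 2), (2, 0), (1, 3), (3, 1)}

classes = quotient(R, A)
print(sorted(sorted(c) for c in classes))
print(equivalence_class(R, 0) == equivalence_class(R, 2))
```

The classes are stored as frozensets so that they can be elements of a Python set, which removes repeated classes such as those of 0 and 2.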


3.164 Example. For each equivalence relation R ⊆ A × A on a set A, the canonical
map, also called the quotient map, is the function P : A → A/R that maps each
element X ∈ A to its equivalence class [X]_R = {Y ∈ A : (X, Y) ∈ R}:


P : A → A/R,
X ↦ [X]_R.


3.8.3 Exercises on Equivalence Relations


3.161. Prove that the empty relation ∅ ⊆ A × A is reflexive, symmetric, and
transitive.

3.162. Prove that for each set A the relation of strict inclusion ⊊ on 𝒫(A) is not
reflexive.

3.163. Prove that for each set A the relation of strict inclusion ⊊ on 𝒫(A) is
anti-symmetric.

3.164. Prove that for each set A the relation of strict inclusion ⊊ on 𝒫(A) is
transitive.

3.165. For the set A := {0, 1, 2, 3, 4, 5}, verify that the following relation ℛ is an
equivalence relation, and list all its equivalence classes:

ℛ := { (1, 5), (3, 5), (5, 5),
       (0, 4), (2, 4), (4, 4),
       (1, 3), (3, 3), (5, 3),
       (0, 2), (2, 2), (4, 2),
       (1, 1), (3, 1), (5, 1),
       (0, 0), (2, 0), (4, 0) }.

3.166. For the set A := {0, 1, 2, 3, 4, 5}, verify that the following relation 𝒮 is an
equivalence relation, and list all its equivalence classes:

𝒮 := { (2, 5), (5, 5),
       (1, 4), (4, 4),
       (0, 3), (3, 3),
       (2, 2), (5, 2),
       (1, 1), (4, 1),
       (0, 0), (3, 0) }.

3.167. For the set B := {0, 1, 2, 3, 4, 5, 6, 7}, verify that the following set ℱ of
subsets is a partition, and list the corresponding equivalence relation:

ℱ := { {0, 2, 4, 6}, {1, 3, 5, 7} }.

3.168. For the set B := {0, 1, 2, 3, 4, 5, 6, 7}, verify that the following set 𝒢 of
subsets is a partition, and list the corresponding equivalence relation:

𝒢 := { {0, 4}, {1, 5}, {2, 6}, {3, 7} }.

3.169. Prove that R_{(ℱ_R)} = R for each equivalence relation R.

3.170. Prove that ℱ_{(R_ℱ)} = ℱ for each partition ℱ.


3.9 Ordering Relations


3.9.1 Preorders and Partial Orders 


Besides the reflexivity, symmetry, and transitivity used to define equivalence 
relations, such other types of relations as rankings also require different variations 
of these concepts, as introduced here (with the terminology of Suppes [128, §3.2, 
p. 69]). 


3.165 Definition (strict, irreflexivity). For each set A, a relation R ⊆ A × A is
irreflexive, or, equivalently, strict, if and only if R does not relate any element of A
to itself:


∀X{(X ∈ A) ⇒ [(X, X) ∉ R]}.


3.166 Example. The empty relation ∅ is strict, because the conclusion (X, X) ∉ ∅
is universally valid.


3.167 Example. For each set A, the relation of strict inclusion ⊊ is strict on 𝒫(A).
Indeed, for all subsets V ⊆ A and W ⊆ A, the definition of V ⊊ W includes the
requirement that V ≠ W, so that (V ⊊ W) ⇒ (V ≠ W). Contraposition then
confirms that (V = W) ⇒ ¬(V ⊊ W), so that ¬(V ⊊ V).


One method to specify a ranking or direction on a set removes the requirement 
of symmetry from the concept of equivalence, which gives the following concept of 
preorder. 


3.168 Definition (preorder or quasi-order). For each set A, a relation R ⊆ A × A
is a preorder or a quasi-order, if and only if R is reflexive and transitive. It is a
strict preorder if and only if R is irreflexive and transitive.


3.169 Example. Consider the set A := {0, 1, 2}.
The relation 𝒬 := {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)} is a preorder.
The relation ℛ := {(0, 0), (0, 1), (1, 1), (2, 2)} is a preorder.
The relation 𝒮 := {(0, 1)} is a strict preorder.


3.170 Example. For each set A, the diagonal Δ_A is a preorder. If A ≠ ∅, then Δ_A is
not strict, because there exists X ∈ A and then (X, X) ∈ Δ_A.


3.171 Example. For each set A the relation ⊆ is a preorder on 𝒫(A). The relation
⊆ is not strict because ∅ ∈ 𝒫(A) and ∅ ⊆ ∅.


3.172 Example. For each set A, the relation ⊊ is a strict preorder on 𝒫(A).


The concept of a preorder R allows for "circular" rankings, with (X, Y) ∈ R and
(Y, X) ∈ R even though X ≠ Y. To specify different types of rankings, a different
concept, anti-symmetry, becomes necessary.


3.173 Definition (anti-symmetry). For each set A, a relation R ⊆ A × A is
anti-symmetric if and only if (X, Y) ∈ R and (Y, X) ∈ R imply X = Y:


∀X(∀Y{([(X, Y) ∈ R] ∧ [(Y, X) ∈ R]) ⇒ (X = Y)}).


3.174 Example. For each set A, the diagonal Δ_A is an anti-symmetric relation.


3.175 Example. For each set A, the relation ⊆ on 𝒫(A) is anti-symmetric. Indeed,
for each B ∈ 𝒫(A) and for each C ∈ 𝒫(A), if B ⊆ C and C ⊆ B, then B = C.


Similar features can also be defined through the following concept of asymmetry. 


3.176 Definition (asymmetry). For each set A, a relation R ⊆ A × A is asymmetric
if and only if (X, Y) ∈ R implies (Y, X) ∉ R for each X ∈ A and each Y ∈ A:


∀X(∀Y{[(X, Y) ∈ R] ⇒ [(Y, X) ∉ R]}).


3.177 Example. Consider the set A := {0, 1, 2}.

The relation 𝒬 := {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)} is not asymmetric: 𝒬
contains (0, 1) and (1, 0).

The relation ℛ := {(0, 0), (0, 1), (1, 1), (2, 2)} is not asymmetric: ℛ contains
(0, 0), (1, 1), and (2, 2).

The relation 𝒮 := {(0, 1)} is asymmetric.


3.178 Example. The following relation is not asymmetric, not anti-symmetric, not
irreflexive, not reflexive, not symmetric, and not transitive:


[Figure: graph of the relation, with edges labeled (X, Y) and (Y, X).]


In contrast to preorders, “partial orders” do not allow for circular rankings. 


3.179 Definition (partial order). For each set A, a relation R ⊆ A × A is a partial
order if and only if R is reflexive, anti-symmetric, and transitive. It is a strict partial
order if and only if R is irreflexive and transitive.


3.180 Example. Consider the set A := {0, 1, 2}.

The relation 𝒬 := {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)} is not a partial order,
because it is not anti-symmetric: 𝒬 contains (0, 1) and (1, 0) but 0 ≠ 1.

The relation ℛ := {(0, 0), (0, 1), (1, 1), (2, 2)} is a partial order.

The relation 𝒮 := {(0, 1)} is a strict partial order.


3.181 Example. For each set A, the diagonal Δ_A is a partial order.


3.182 Example. For each set A the relation ⊆ is a partial order on 𝒫(A).


3.183 Example. For each set A the relation ⊊ is a strict partial order on 𝒫(A).
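
The classifications in examples 3.169 and 3.180 can be confirmed by testing the defining properties on the finite set A = {0, 1, 2}. In the following Python sketch, the predicate names and the variable names Q, R, S (standing for the three relations of the examples) are assumptions of this sketch.

```python
def is_reflexive(R, A):
    return all((x, x) in R for x in A)

def is_irreflexive(R, A):
    return all((x, x) not in R for x in A)

def is_transitive(R):
    return all((x, z) in R
               for (x, y) in R for (y2, z) in R if y == y2)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_preorder(R, A):
    return is_reflexive(R, A) and is_transitive(R)

def is_partial_order(R, A):
    return is_preorder(R, A) and is_antisymmetric(R)

def is_strict_partial_order(R, A):
    return is_irreflexive(R, A) and is_transitive(R)

A = {0, 1, 2}
Q = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
R = {(0, 0), (0, 1), (1, 1), (2, 2)}
S = {(0, 1)}

print(is_preorder(Q, A), is_partial_order(Q, A))
print(is_partial_order(R, A))
print(is_strict_partial_order(S, A))
```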
As their names might suggest, neither preorders nor partial orders need relate all
elements to one another. Relations that do so are called strongly connected.


3.184 Definition (Strong connectivity). For each set A, a relation R ⊆ A × A
is strongly connected if and only if (X, Y) ∈ R or (Y, X) ∈ R (including both
possibilities) for each X ∈ A and for each Y ∈ A:


∀X[∀Y([(X ∈ A) ∧ (Y ∈ A)] ⇒ {[(X, Y) ∈ R] ∨ [(Y, X) ∈ R]})].


Strict rankings replace strong connectivity by connectivity.


3.185 Definition (connectivity). For each set A, a relation R ⊆ A × A is connected
if and only if (X, Y) ∈ R or (Y, X) ∈ R (including both possibilities) for all distinct
X ∈ A and Y ∈ A (such that X ≠ Y):


∀X[∀Y([(X ∈ A) ∧ (Y ∈ A) ∧ (X ≠ Y)] ⇒ {[(X, Y) ∈ R] ∨ [(Y, X) ∈ R]})].


3.186 Example. Consider the set A := {0, 1, 2}.

The relation 𝒬 := {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)} is neither connected nor
strongly connected, because it contains neither (0, 2) nor (2, 0).

The relation 𝒯 := {(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)} is strongly
connected.
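
The difference between definitions 3.184 and 3.185 lies only in the pairs (X, X). The following Python sketch checks example 3.186 and adds a hypothetical strict relation S, not taken from the text, that is connected without being strongly connected; the predicate names are also assumptions of this sketch.

```python
def is_strongly_connected(R, A):
    return all((x, y) in R or (y, x) in R for x in A for y in A)

def is_connected(R, A):
    return all((x, y) in R or (y, x) in R
               for x in A for y in A if x != y)

A = {0, 1, 2}
Q = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}
T = {(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)}
S = {(0, 1), (0, 2), (1, 2)}   # hypothetical strict relation on A

print(is_connected(Q, A), is_strongly_connected(Q, A))
print(is_strongly_connected(T, A))
print(is_connected(S, A), is_strongly_connected(S, A))
```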


3.9.2 Total Orders and Well-Orderings 


The geometric “direction along a line” corresponds to a total order, also called 
linear order, complete order [71, p. 14], or simple order [71, p. 14], [128, p. 69]. 


3.187 Definition (total order). For each set A, a relation R ⊆ A × A is a total order,
or total ordering, if and only if R is a strongly connected partial order (strongly
connected, reflexive, anti-symmetric, and transitive). It is a strict total order if and
only if R is connected, irreflexive, and transitive.


3.188 Example. The empty relation on the empty set is a strict total order.


3.189 Example. For each set S and the singleton A = {S}, the relation ⊆ on
𝒫(A) = {∅, {S}} is a total order. Indeed, ∅ ⊆ ∅, ∅ ⊆ {S}, and {S} ⊆ {S}.


3.190 Counterexample. The relation ⊆ is not a total order in general. Thus, if

A := {∅, {∅}},

then ⊆ is not a total order on the power set

𝒫(A) = {∅, {∅}, {{∅}}, {∅, {∅}}},

because

{∅} ⊈ {{∅}},
{{∅}} ⊈ {∅}.

Thus the relation ⊆ contains neither the pair ({∅}, {{∅}}) nor the pair
({{∅}}, {∅}). Instead, for this set A the relation ⊆ takes the following form:


[Figure: diagram of the inclusions ∅ ⊆ {∅} ⊆ {∅, {∅}} and ∅ ⊆ {{∅}} ⊆ {∅, {∅}}.]

Some relations that are not total orders can restrict to total orders on subsets. 


3.191 Definition (chain). For each set A, and for each relation R ⊆ A × A, a subset
B ⊆ A is a chain if and only if the restriction R|_B is a total order on B.


In particular, for each total order R on each set A, the set A is a chain relative to R. 

In partially ordered sets, a subset might have a first element, which precedes 
every element of that subset, or a last element, which follows every element of that 
subset. 


3.192 Definition (first or last element). For each set A, for each subset B ⊆ A,
and for each partial order R ⊆ A × A, an element X ∈ B is a minimum, or first, or
smallest, element in B if and only if (X, Y) ∈ R for each Y ∈ B. A first element of
B is also denoted by min(B). Similarly, an element Z ∈ B is a maximum, or last,
or largest, element in B if and only if (Y, Z) ∈ R for each Y ∈ B. A last element of
B is also denoted by max(B).


3.193 Example. For each set A and for the relation ⊆ on 𝒫(A), the element ∅ ∈
𝒫(A) is a first element of 𝒫(A). Indeed, ∅ ⊆ C for each C ∈ 𝒫(A). Also, the
element A ∈ 𝒫(A) is a last element of 𝒫(A). Indeed, C ⊆ A for each C ∈ 𝒫(A).


3.194 Definition (well-ordering). For each set A and for each partial order R ⊆
A × A, the set A is well-ordered by R if and only if each nonempty subset B ⊆ A has
a first element. A relation R is called a well-ordering (for the lack of a grammatical
and logically equivalent terminology) if and only if A is well-ordered by R.


3.195 Remark. Instead of requiring that a well-ordering be a partial order [30,
p. 31], some texts impose the stronger requirement that a well-ordering be a total
order [71, p. 29]. Yet the insistence on a total order is redundant; indeed, if each
nonempty subset has a first element, then a partial order is automatically a total
order [128, pp. 74-76].


3.196 Example. The set {∅} = 𝒫(∅) is well-ordered by the relation of inclusion
⊆. Indeed, the only nonempty subset of {∅} = 𝒫(∅) is {∅}, and it has a first
element, namely ∅, because ∅ ⊆ C for each C ∈ {∅}.


3.197 Theorem. Every subset of a well-ordered set is well-ordered.


Proof. Each nonempty subset E ⊆ B of a subset B ⊆ A of a set A with a well-order
≼ is also a subset E ⊆ A and hence has a first element. Thus ≼ induces a well-order
on B. □


3.198 Definition (upper or lower bound). For each set A, for each subset B ⊆ A,
and for each partial order R ⊆ A × A, an element Z ∈ A is an upper bound for B
if and only if (Y, Z) ∈ R for each Y ∈ B. Similarly an element X ∈ A is a lower
bound for B if and only if (X, Y) ∈ R for each Y ∈ B.


Thus a last element can differ from an upper bound because an upper bound 
need not belong to the same subset. Similarly a first element can differ from a lower 
bound because a lower bound need not belong to the same subset. 


3.199 Definition (maximal element). For each set A, for each subset B ⊆ A, and 
for each partial order R ⊆ A × A, an element Z ∈ B is a maximal element of B if 
and only if [(Z, Y) ∈ R] ⇒ [(Y, Z) ∈ R] for each Y ∈ B. In other words, Z ∈ B 
is a maximal element of B if and only if every element Y ∈ B that follows Z also 
precedes Z (which allows for the possibility that no Y ∈ B follows Z). 

Similarly, an element Z ∈ B is a minimal element of B if and only if [(Y, Z) ∈ 
R] ⇒ [(Z, Y) ∈ R] for each Y ∈ B. 


Thus, either a maximal element precedes no element, or it follows every element 
that it precedes, but in either case it belongs to the same subset. For instance, a 
last element is a maximal element. However, a maximal element need not be a last 
element. 


3.200 Example. Consider the set A := {0, 1, 2} and the relation 
Q := {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}. 


The element Z := 2 is maximal in A, because the hypothesis (Z, Y) ∈ Q in [(Z, Y) ∈ 
Q] ⇒ [(Y, Z) ∈ Q] is False for each Y ≠ 2 in A. Yet 2 is not a last element in A, 
because 2 does not follow every element: (0, 2) ∉ Q and (1, 2) ∉ Q. 
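
The difference between a maximal element and a last element can also be checked mechanically. The sketch below assumes that the relation of example 3.200 reads Q = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}; the function names `is_last` and `is_maximal` are hypothetical and merely transcribe the two definitions.

```python
A = {0, 1, 2}
# Assumed reading of the relation in example 3.200.
Q = {(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)}

def is_last(z, B, R):
    """z is a last element of B: (y, z) is in R for each y in B."""
    return all((y, z) in R for y in B)

def is_maximal(z, B, R):
    """z is maximal in B: every y in B that follows z also precedes z."""
    return all((y, z) in R for y in B if (z, y) in R)

print(is_maximal(2, A, Q))  # True
print(is_last(2, A, Q))     # False
print(sorted(z for z in A if is_maximal(z, A, Q)))  # [0, 1, 2]
```

Under this reading every element of A is maximal, but no element is last.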


Such notions as the geometric direction on an “infinite line, plane, or space” 
require another axiom, the axiom of infinity, which forms the subject of chapter 4. 


Axiom S7 (Axiom of infinity) There exists a set I, such that ∅ ∈ I, and for each 
element C ∈ I its successor C ∪ {C} is also an element of I: 


⊢ ∃I[(∅ ∈ I) ∧ (∀C {(C ∈ I) ⇒ [(C ∪ {C}) ∈ I]})]. 


The concept of well-ordered sets allows for many logically equivalent statements 
of the last axiom of Zermelo-Fraenkel set theory [30, Ch. 1, §9, p. 23, #9.2(3)]. 


Axiom S8 (Axiom of choice) For each set ℱ of nonempty sets (A ≠ ∅ for each 
A ∈ ℱ), there is a “choice” function F : ℱ → ⋃ℱ with F(A) ∈ A for each 
A ∈ ℱ. 

The axiom of choice is logically equivalent to each of the following two 
theorems, as proved in chapter 6. 


3.201 Theorem (Zermelo’s Theorem). Every set is well-ordered by some 
relation. 


3.202 Theorem (Zorn’s Lemma). In a pre-ordered set A, if every chain in A has 
an upper bound, then A has a maximal element. 


3.9.3 Exercises on Ordering Relations 


For the following exercises, determine all of the characteristics — among connec- 
tivity, strong connectivity, reflexivity, irreflexivity, symmetry, anti-symmetry, asym- 
metry, transitivity — of the given relation on the set A := {0, 1,2, 3, 4,5, 6, 7, 8, 9}. 


3.171. 


(0, 9) (1, 9) (3, 9) (6, 9) 
(0, 8) (1, 8) (2, 8) (4,8) (6,8) 
(0,7) (1, 7) 

(0, 6) (1, 6) (2, 6) (3, 6) (4, 6) 

(0,5) (1, 5) 

(0, 4) (1, 4) (2,4) 

(0,3) (1, 3) 

(0,2) (1, 2) 

(0, 1) 


3.172. 


(0, 9) (1, 9) (3, 9) (6, 9) (9, 9) 
(0,8) (1, 8) (2,8) (4, 8) (6, 8) (8, 8) 
(0,7) (1,7) (7,7) 

(0, 6) (1, 6) (2, 6) (3, 6) (4, 6) (6, 6) 

(0,5) (1,5) (5,5) 

(0, 4) (1, 4) (2, 4) (4, 4) 

(0,3) (1, 3) (3, 3) 

(0,2) (1,2) (2, 2) 

(0, 1) (1, 1) 

(0, 0) 


3.173. 


(0, 9) (1, 9) (3, 9) (6, 9) 
(0, 8) (1, 8) (2, 8) (4, 8) (6, 8) 
(0, 7) (1, 7) 
(0, 6) (1, 6) (2, 6) (3, 6) (4, 6) (8, 6) (9, 6) 
(0, 5) (1, 5) 
(0, 4) (1, 4) (2, 4) (6, 4) (8, 4) 
(0, 3) (1, 3) (6, 3) (9, 3) 
(0, 2) (1, 2) (4, 2) (6, 2) (8, 2) 
(0, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1) (7, 1) (8, 1) (9, 1) 
(1, 0) (2, 0) (3, 0) (4, 0) (5, 0) (6, 0) (7, 0) (8, 0) (9, 0) 


3.174. 


(0, 9) (1, 9) (3, 9) (6, 9) (9, 9) 
(0, 8) (1, 8) (2, 8) (4, 8) (6, 8) (8, 8) 
(0, 7) (1, 7) (7, 7) 
(0, 6) (1, 6) (2, 6) (3, 6) (4, 6) (6, 6) (8, 6) (9, 6) 
(0, 5) (1, 5) (5, 5) 
(0, 4) (1, 4) (2, 4) (4, 4) (6, 4) (8, 4) 
(0, 3) (1, 3) (3, 3) (6, 3) (9, 3) 
(0, 2) (1, 2) (2, 2) (4, 2) (6, 2) (8, 2) 
(0, 1) (1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1) (7, 1) (8, 1) (9, 1) 
(0, 0) (1, 0) (2, 0) (3, 0) (4, 0) (5, 0) (6, 0) (7, 0) (8, 0) (9, 0) 


3.175. 


(1, 9) (9, 9) 
(1, 8) (8, 8) 
(1, 7) (7, 7) 
(1, 6) (6, 6) 
(1, 5) (5, 5) 
(1, 4) (4, 4) 
(1, 3) (3, 3) (9, 3) 
(1, 2) (2, 2) (4, 2) (8, 2) 
(0, 1) (1, 1) 
(1, 0) 


3.176. 


(1, 9) (3, 9) (9, 9) 
(1, 8) (2, 8) (8, 8) 
(1, 7) (7, 7) 
(1, 6) (6, 6) 
(1, 5) (5, 5) 
(1, 4) (2, 4) (4, 4) 
(1, 3) (3, 3) (9, 3) 
(1, 2) (2, 2) (4, 2) (8, 2) 
(0, 1) (1, 1) (2, 1) (3, 1) (4, 1) (5, 1) (6, 1) (7, 1) (8, 1) (9, 1) 
(1, 0) 


3.177. Prove that for each set A the empty relation ∅ ⊆ A × A is a partial order. 
3.178. Prove that a relation R is symmetric if and only if R = R⁻¹. 
3.179. Prove that a relation R on A is irreflexive if and only if R ∩ Δ_A = ∅. 


3.180. Prove that a relation R on a set A is anti-symmetric if and only if R ∩ 
R⁻¹ ⊆ Δ_A. 


3.181. Prove that every strict partial order is asymmetric. 


3.182. Prove that if a relation is irreflexive and anti-symmetric, then it is also 
asymmetric. 


3.183. Prove that every asymmetric relation is also anti-symmetric. 
3.184. Prove that every asymmetric relation is also irreflexive. 


3.185. Prove that a relation is a strict partial order if and only if it is asymmetric 
and transitive. 


3.186 . Provide an example of an anti-symmetric but not asymmetric relation. 
3.187 . Provide an example of an asymmetric and strongly connected relation. 


3.188. Exhibit an example of a set A for which the relation ⊆ on 𝒫(A) is not a 
total order: supply subsets B and C such that neither B ⊆ C nor C ⊆ B. 


3.189. Exhibit an example of a set A and a subset S ⊆ 𝒫(A) that is a chain with 
respect to the relation ⊆ on 𝒫(A). 


3.190. Prove that if a function is surjective, then it has a right inverse. 




Chapter 4 
Mathematical Induction: Definitions and Proofs 
by Induction 


4.1 Introduction 


This chapter introduces the concepts of mathematical induction and recursion. 
Starting from first-order logic and the Zermelo-Fraenkel axioms of set theory, 
including the axiom of infinity, the chapter establishes the theoretical basis for 
proofs by the Principle of Mathematical Induction, followed by definitions through 
mathematical induction, which forms the basis for the concept of primitive- 
recursive functions. As an extended example, one section defines integer addition 
and multiplication as primitive-recursive functions, and hence derives the basic 
properties of integer arithmetic (associativity, commutativity, and distributivity) by 
induction. Subsequent sections define ordering relations and derive cancellation 
laws and exponential laws for integer and rational arithmetic. Another section 
then applies arithmetic to the cardinalities of unions, intersections, differences, and 
Cartesian products of finite sets. Yet another section focuses on denumerable and 
other not-necessarily finite sets, leading to J. M. Whitaker’s proof of the 
Bernstein-Cantor-Schröder Theorem within Zermelo-Fraenkel set theory. The prerequisites 
for this chapter consist of a working knowledge of first-order logic and 
Zermelo-Fraenkel set theory, for instance, as described in chapters 1, 2, and 3, which contain 
all the logical and set-theoretical theorems cited in this chapter. 

The concepts of “numbers” and “counting” allow for the determination and 
comparison of the “sizes” of various objects. The same concepts also lead to precise 
definitions of “infinite” sets, which then reveal an “infinite” variety of “infinite” 
sets. For example, the number of particles in the universe is related to its mass, 
volume, and density, which quantities might affect the fate of the universe, in 
particular, whether the universe will remain finite or will expand to “infinity” [69, 
p. 445]. 

Mathematically, this chapter shows how the concepts of “numbers,” “arithmetic,” 
“counting,” “computing,” and “infinity” fit within set theory, without any new 
axiomatic system. 


4.2 Mathematical Induction 


The principle of “mathematical induction” provides a method for proving theorems, 
and also a method for specifying and analyzing many practical numerical calcula- 
tions. 


4.2.1 The Axiom of Infinity 


The Principle of Mathematical Induction forms the theoretical basis for such 
algorithms as counting and arithmetic, by means of sequences of definitions, 
computations, and verifications, where the length of the sequence depends on the 
situation. One framework to allow for sequences of yet unspecified lengths consists 
in embedding all such sequences into one set ℕ that already allows for sequences 
of all lengths. To this end, with a construction attributed to John von Neumann [8, 
p. 22], [128, p. 129], [135], for each element X ∈ ℕ, the set ℕ also contains a “next” 
element X ∪ {X}. 


4.1 Definition. For each set X the successor of X is X ∪ {X}. 
4.2 Example. The successor of the empty set ∅ is the set ∅ ∪ {∅} = {∅}. 


A subsequent theorem will verify that for each set X ∈ ℕ the successor X ∪ {X} 
is strictly larger than X, so that X ⊊ (X ∪ {X}). Within the theory presented so 
far, however, nothing guarantees the existence of a set containing the successor of 
each of its elements. To this end, a new axiom becomes necessary [8, p. 21]. The 
following version is identical to that of axiom S7 already mentioned in chapter 3. 


Axiom S7 (Axiom of infinity) There exists a set I, such that ∅ ∈ I, and for each 
element C ∈ I its successor C ∪ {C} is also an element of I: 


⊢ ∃I[(∅ ∈ I) ∧ (∀C {(C ∈ I) ⇒ [(C ∪ {C}) ∈ I]})]. 


Digits or other symbols can abbreviate the set notation for the elements of I. 


4.3 Example. The set I described in axiom S7 contains the following elements. 


0 := ∅, 
1 := 0 ∪ {0} = ∅ ∪ {∅} = {∅}, 
2 := 1 ∪ {1} = {∅} ∪ {{∅}} = {∅, {∅}}, 
3 := 2 ∪ {2} = {∅, {∅}} ∪ {{∅, {∅}}} = {∅, {∅}, {∅, {∅}}}, 
4 := 3 ∪ {3} = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}. 


Such elements are called “natural numbers” (see remark 4.7 about 0). 
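
The first few elements listed in example 4.3 can be reproduced with a short program. The following Python sketch is only an illustration, under the assumption that sets are modeled by `frozenset` objects; the names `successor` and `natural` are hypothetical helpers, not notation from the text.

```python
def successor(x):
    """Return the successor x ∪ {x} of the frozenset x."""
    return x | frozenset([x])

def natural(n):
    """Build the set that represents the Python integer n (n >= 0)."""
    x = frozenset()          # 0 := ∅
    for _ in range(n):
        x = successor(x)     # next element: x ∪ {x}
    return x

zero = frozenset()
one = successor(zero)
two = successor(one)

print(one == frozenset([zero]))        # True: 1 = {∅}
print(two == frozenset([zero, one]))   # True: 2 = {∅, {∅}}
print(len(natural(4)))                 # 4: the set 4 has the four elements 0, 1, 2, 3
```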


The axiom of infinity does not restrict I to contain only the elements in 
example 4.3, so that the set I might also contain other elements. Therefore, a further 
construction with the preceding axioms becomes necessary to form a set containing 
only the natural numbers. The construction selected here forms the intersection of 
subsets of I, using a method common to several parts of mathematics [30, p. 66], 
[34, p. 21], [60, p. 132]. Specifically, denote by P the logical formula defined by 


(P) ⇔ [(∅ ∈ A) ∧ (∀X {(X ∈ A) ⇒ [(X ∪ {X}) ∈ A]})]. 


Thus the formula P asserts that a set A contains among its elements the empty set 
and the successor of every one of its elements. The Axiom of Infinity asserts that 
there exists a set I for which Subf^A_I(P) is True: 


⊢ ∃I[(∅ ∈ I) ∧ (∀X {(X ∈ I) ⇒ [(X ∪ {X}) ∈ I]})]. 


The following construction forms a subset ℕ ⊆ I containing only elements such as 
those in example 4.3. To this end, consider the set ℱ ⊆ 𝒫(I) of all subsets B ⊆ I 
for which Subf^A_B(P) is also True: 


ℱ := {B ∈ 𝒫(I) : Subf^A_B(P)} 
   = {B ∈ 𝒫(I) : [(∅ ∈ B) ∧ (∀X {(X ∈ B) ⇒ [(X ∪ {X}) ∈ B]})]}. 


The following definition and theorems will confirm that ℕ can be defined as ⋂ℱ. 


4.4 Theorem. The set ℱ = {B ∈ 𝒫(I) : Subf^A_B(P)} is not empty: I ∈ ℱ. 


Proof. By the Axiom of Infinity (S7), the formula P is True for I, so that Subf^A_I(P) 
is True. Moreover, I ⊆ I. Therefore, I is a subset of I for which P is True, so that 
(I ⊆ I) ∧ [Subf^A_I(P)] holds, which means that I ∈ ℱ. □ 


From I the following definition extracts ℕ as the smallest subset for which P 
holds. 


4.5 Definition (natural numbers). The set of natural numbers, denoted by ℕ, is 
the intersection of all the elements of ℱ: 


ℕ := ⋂ℱ. 


A natural number is an element of ℕ. Also, define ℕ* := ℕ \ {∅}. 


4.6 Remark. In the present theory, every natural number is a set. 


4.7 Remark. Some texts exclude the element ∅ from the set ℕ. The two definitions 
differ only in their terminology and lead to the same theory. Yet the definition of ℕ 
with ∅ ∈ ℕ has proved more convenient than without it in the context of set theory 
[128, p. 121], ordinals [74], and in such situations as Kurt Gödel’s work [46] on 
logic and mathematics. Therefore, the definition adopted here includes ∅ ∈ ℕ. 


The following theorems verify that P also holds for ℕ. 


4.8 Theorem. The set ℕ = ⋂ℱ is not empty; indeed, ∅ ∈ ℕ. 


Proof. First, ∅ ∈ I and I ∈ ℱ by theorem 4.4, whence ∅ ∈ ⋃ℱ. Second, for each 
B ∈ ℱ the formula Subf^A_B(P) holds, hence ∅ ∈ B; consequently, ∅ ∈ ⋂ℱ = ℕ. 
□ 


4.9 Theorem. The formula P is True for ℕ: ∀X {(X ∈ ℕ) ⇒ [(X ∪ {X}) ∈ ℕ]}. 


Proof. For each X, if X ∈ ℕ = ⋂ℱ, then X ∈ B for each B ∈ ℱ, but then 
(X ∪ {X}) ∈ B for each B ∈ ℱ, whence (X ∪ {X}) ∈ ⋂ℱ = ℕ. □ 


4.10 Theorem. The set ℕ is an element of ℱ. 


Proof. Theorems 4.8 and 4.9 show that ∅ ∈ ℕ and Subf^A_ℕ(P) are True. Moreover, 
ℕ = ⋂ℱ ⊆ I. Thus, (ℕ ⊆ I) ∧ [Subf^A_ℕ(P)] holds, whence ℕ ∈ ℱ. □ 


The concept of “successor” can serve to specify functions defined on N. 


4.11 Example. The “successor” function is defined by 


G := {(X, Y) ∈ (ℕ × ℕ) : Y = (X ∪ {X})}, 


so that 


G : ℕ → ℕ, 
X ↦ X ∪ {X}. 


4.12 Definition (sequence). For each set E, a sequence in E is a function F : ℕ → 
E. For each N ∈ ℕ, the value F(N) can also be denoted by F_N, and then the function 
F can also be denoted by (F_N) or (F_N)_{N∈ℕ}. Also, a finite sequence of length L in E 
is a function S : L → E defined on some L ∈ ℕ. The value S(K) is also denoted by 
S_K; then the function S is also denoted by (S_K) or (S_K)_{K∈L}. 


4.13 Example. The “successor” function in example 4.11 is a sequence in ℕ. 


For convenience, mathematical usage abbreviates the successor N ∪ {N} as N + 1. 


4.14 Definition. For each natural number N, let N + 1 := N ∪ {N}. 


In general, the specification of functions defined on N requires a method known 
as the Principle of Mathematical Induction, as described in the next subsections. 


4.2.2 The Principle of Mathematical Induction 


The following theorem forms the theoretical basis for the methods of proof by 
induction and recursive computation. Specifically, the “Principle of Mathematical 
Induction” shows that if a subset S ⊆ ℕ contains ∅, and if S also contains the 
successor X ∪ {X} of each of its elements X, then S contains all the natural numbers: 
S = ℕ. 


4.15 Theorem (Principle of Mathematical Induction). For each subset S ⊆ ℕ, if 
S contains the empty set and the successor of every one of its elements, then S = ℕ. 


Thus, ⊢ ∀S {(S ⊆ ℕ) ∧ [Subf^A_S(P)] ⇒ (S = ℕ)}; or, with Subf^A_S(P) spelled out: 


⊢ ∀S[(S ⊆ ℕ) ∧ (∅ ∈ S) ∧ (∀X {(X ∈ S) ⇒ [(X ∪ {X}) ∈ S]}) ⇒ (S = ℕ)]. 


Proof. If S ⊆ ℕ then S ⊆ ℕ = ⋂ℱ ⊆ I. If moreover Subf^A_S(P) is True, then 
S ∈ ℱ by definition of ℱ. From S ∈ ℱ, it then follows that ⋂ℱ ⊆ S. Thus, 
⋂ℱ ⊆ S ⊆ ⋂ℱ, whence (by theorem 3.10) equality holds: S = ⋂ℱ = ℕ. □ 


4.16 Remark. The proof of theorem 4.15 derives, or, equivalently, deduces the 
Principle of Mathematical Induction from the preceding axioms and inference rules 
of logic and axioms or postulates of set theory. Therefore, in mathematics, all 
inductive proofs are deductive: 


In mathematical English, the words “inductive” and “deductive” may be used interchange- 
ably for any kind of reasoning, except that the principle listed in [theorem 4.15] is usually 
called the Induction Principle [114, p. 195, Remark]. 


The following example demonstrates a pattern amenable to induction. 
4.17 Example. The elements of ℕ in example 4.3 reveal the following pattern: 


• Every element of 0 is also a subset of 0, so that ⊢ ∀X[(X ∈ 0) ⇒ (X ⊆ 0)]. The 
  hypothesis X ∈ 0 is False, whence the implication is universally valid. 

• Every element of 1 = {∅} is also a subset of 1, because ∅ ∈ 1 and ∅ ⊆ 1. 

• Every element of 2 = {∅, {∅}} is also a subset of 2. Indeed ∅ ∈ 2 and ∅ ⊆ 2; 
  also, {∅} ∈ 2 and {∅} ⊆ 2. 

• Similarly, every element of 3 = {∅, {∅}, {∅, {∅}}} is also a subset of 3. Indeed 
  ∅ ∈ 3 and ∅ ⊆ 3; moreover, {∅} ∈ 3 and {∅} ⊆ 3; furthermore, {∅, {∅}} ∈ 3 
  and {∅, {∅}} ⊆ 3. 


The proof of the following theorem shows the use of the Principle of Mathematical 
Induction to verify the pattern of example 4.17 for all natural numbers. 


4.18 Theorem. For all L ∈ ℕ and N ∈ ℕ, if L ∈ N then L ⊆ N. 
Proof. This proof proceeds by induction with N. Define a set S ⊆ ℕ by 


S := {N ∈ ℕ : ∀L[(L ∈ N) ⇒ (L ⊆ N)]}. 


Initial step 


If N := ∅, then L ∈ ∅ is False, whence (L ∈ ∅) ⇒ (L ⊆ ∅) is universally valid; 
thus ∅ ∈ S, by definition of S. 


Inductive hypothesis 


Assume K ∈ S; thus ∀L[(L ∈ K) ⇒ (L ⊆ K)] is True. 


Inductive step 


To verify the theorem for N := K + 1, assume that L ∈ (K + 1) = (K ∪ {K}); then 
L ∈ K or L ∈ {K}, by definition of the union K ∪ {K}. 

In the case L ∈ K, it follows from the inductive hypothesis that L ⊆ K, whence 
L ⊆ K ⊆ (K ∪ {K}) = (K + 1) and then L ⊆ (K + 1). 

In the case L ∈ {K}, it follows that L = K, whence L = K ⊆ (K ∪ {K}) = (K + 1) 
and then L ⊆ (K + 1). 

Consequently L ⊆ (K + 1) in either case. Because L ⊆ (K + 1) for every L ∈ (K + 1), 
it follows that (K + 1) ∈ S. 


Completion of the proof by induction 


From the initial step ∅ ∈ S, and from the inductive step, if K ∈ S then (K + 1) ∈ S; 
it follows from the Principle of Mathematical Induction (theorem 4.15) that S = ℕ. 
Thus N ∈ S for every N ∈ ℕ: for every N ∈ ℕ, if L ∈ N then L ⊆ N. □ 


Theorem 4.18 holds for all natural numbers, but it can fail for other sets, as 
demonstrated in counterexample 4.19. 


4.19 Counterexample. If X = {∅} and Y = {{∅}}, then X ∈ Y but X ⊄ Y, 
because ∅ ∈ X but ∅ ∉ Y. Theorem 4.18 does not apply to Y because Y ∉ ℕ. 
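
A finite check cannot replace the proof by induction, but it can illustrate both theorem 4.18 and counterexample 4.19. The Python sketch below again models sets by `frozenset` objects; the helper names `successor`, `natural`, and `every_element_is_subset` are hypothetical and introduced only for this illustration.

```python
def successor(x):
    """Return the successor x ∪ {x} of the frozenset x."""
    return x | frozenset([x])

def natural(n):
    """Build the set that represents the Python integer n (n >= 0)."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

def every_element_is_subset(s):
    """Check that each element L of s is also a subset of s."""
    return all(element <= s for element in s)

# Theorem 4.18, verified only for the first few natural numbers.
print(all(every_element_is_subset(natural(n)) for n in range(7)))  # True

# Counterexample 4.19: X = {∅} is an element of Y = {{∅}} but not a subset of Y.
X = frozenset([frozenset()])
Y = frozenset([X])
print(X in Y, X <= Y)  # True False
```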


Similarly, theorem 4.20 shows that every natural number is also a subset of ℕ. 


4.20 Theorem. For every set N, if N ∈ ℕ, then N ⊆ ℕ. 


Proof. This proof proceeds by induction with N. 
For N = 0 = ∅, ∅ ⊆ ℕ by theorem 3.11. 
As an induction hypothesis, assume that there exists K ∈ ℕ such that K ⊆ ℕ. 
For the induction step, from K ∈ ℕ follows {K} ⊆ ℕ by theorem 3.22. Also, 
K ⊆ ℕ by induction hypothesis. Hence K + 1 = K ∪ {K} ⊆ ℕ. □ 


Theorem 4.20 holds for N but fails for other sets, as already demonstrated in 
counterexample 4.19. 

The following subsection demonstrates how to use the Principle of Mathematical 
Induction to prove the “existence” of certain functions. 


4.2.3. Definitions by Mathematical Induction 


The preceding discussion has introduced the Principle of Mathematical Induction 
as a method to prove theorems and verify formulae involving natural numbers. In 
addition, the following theorem shows how the Principle of Mathematical Induction, 
and the concept of unions of functions, also form the foundation of a method to 
define by induction the values of some functions. The same method is also known 
as recursion [18, p. 322, n. 526], [63, p. 10]; the resulting functions are also called 
recursive functions, with a terminology attributed [49, p. 167] to Kurt Gödel [43], 
[46, p. 46]. 


4.21 Theorem (Definition by Induction or Recursion). For each nonempty set C, 
for each element A ∈ C, and for each function G : C → C with domain 𝒟(G) = 
C, there exists exactly one function F : ℕ → C that satisfies the following two 
conditions: 


(DMI.0) F(0) = A. 
(DMI.1) F(N + 1) = G[F(N)] for each N ∈ ℕ. 


Proof. This proof verifies by induction that there exists a sequence of functions 
(F_N), with each function F_N ⊆ (ℕ × C) defined on the set S_N := (N ∪ {N}) so that 
the following two conditions hold: 


(N.0) F_N(0) = A. 
(N.1) F_N(I + 1) = G[F_N(I)] for each I ∈ S_N \ {N} = N. 


Then the proof ends by verifying that the function F defined by F := ⋃_{N∈ℕ} F_N, 
so that F(N) = F_N(N), satisfies all the requirements. 


Initial step 


For N = 0, consider the set S_0 = {0}, and consider the function 


F_0 : {0} → C, 
0 ↦ A. 


Then the function F_0 just defined satisfies the following two conditions: 


(0.0) F_0(0) = A. 
(0.1) F_0(I + 1) = G[F_0(I)] for each I ∈ S_0 \ {0} = ∅. 


Moreover, there exists only one function of singletons F_0 : {0} → {A}. 


Induction hypothesis 


Assume that there exists a natural number J ∈ ℕ such that the theorem holds for 
N := J, so that for each L ∈ S_J = J ∪ {J} there exists exactly one function F_L 
satisfying the two conditions (N.0) and (N.1) for N := L. 


Induction step 


Let S ⊆ ℕ consist of every N ∈ ℕ for which there exists exactly one function F_N 
satisfying (N.0) and (N.1). 

There exists exactly one function of singletons H_J : {J + 1} → {G[F_J(J)]}, 
because there exists exactly one function F_J and hence exactly one value F_J(J). 

Define F_{J+1} := F_J ∪ H_J, which is a function because of the disjoint domains 
S_J ∩ {J + 1} = ∅ (by definition 3.94 in chapter 3). 

For N := J + 1, the definition of F_{J+1} gives F_{J+1}(J + 1) = H_J(J + 1) = 
G[F_J(J)] = G[F_{J+1}(J)]. For each L ∈ S_J \ {J} = J, the definition of F_{J+1} gives 
F_{J+1}(L + 1) = F_J(L + 1) = G[F_J(L)] = G[F_{J+1}(L)], so that the following two 
conditions hold: 


(J + 1.0) F_{J+1}(0) = F_J(0) = A, 
(J + 1.1) F_{J+1}(L + 1) = G[F_{J+1}(L)] for every L ∈ S_{J+1} \ {J + 1}. 


Moreover, there exists exactly one such function F_{J+1} because there exists exactly 
one restriction F_{J+1}|_{S_J}, namely F_J, and exactly one value for F_{J+1}(J + 1) = G[F_J(J)]. 


Completion of the proof of the theorem 


Having constructed the sequence of functions (F_N), define a function F : ℕ → C 
by F(N) := F_N(N). Specifically, let 


F := ⋃_{N∈ℕ} F_N. 


Then verify that 


(DMI.0) F(0) = F_0(0) = A. 
(DMI.1) F(N + 1) = F_{N+1}(N + 1) = G[F_{N+1}(N)] = G[F_N(N)] = G[F(N)]. 


The uniqueness of F follows from the uniqueness of each restriction F|_{N+1} = F_N. 
□ 


Because the proof of the validity of the method of recursion (theorem 4.21) relies 
on mathematical induction (theorem 4.15), recursion is a logical consequence of 
induction. The following sections show how recursion suffices to define arithmetic 
with natural numbers. 
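
The two conditions (DMI.0) and (DMI.1) translate directly into a program that computes F(N) by applying G repeatedly, starting from A. The following Python sketch is only an illustration: the name `define_by_recursion` is hypothetical, and Python integers stand in for the natural numbers of the text.

```python
def define_by_recursion(a, g):
    """Return the function F with F(0) = a and F(n + 1) = g(F(n))."""
    def f(n):
        value = a                # (DMI.0): F(0) = A
        for _ in range(n):
            value = g(value)     # (DMI.1): F(N + 1) = G[F(N)]
        return value
    return f

# Example with C the set of integers, A := 1, and G(c) := 2 * c.
doubling = define_by_recursion(1, lambda c: 2 * c)
print([doubling(n) for n in range(6)])  # [1, 2, 4, 8, 16, 32]
```

The loop performs exactly N applications of G, which mirrors the way in which the proof extends F_J to F_{J+1} by one additional value.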


4.2.4 Exercises on Mathematical Induction 


4.1. With 5 := 4U {4}, write 5 in terms of sets, as in example 4.3. 
4.2. With 6 := 5 U {5}, write 6 in terms of sets, as in example 4.3. 
4.3. Prove that every set is an element and a subset of its successor. 


4.4. Provide an example of a nonempty proper subset S ⊊ ℕ that contains the 
successor of every one of its elements. 


Prove or disprove each of the following statements for all sets A and B. 
4.5. Prove or disprove that if A ⊆ B, then (A ∪ {A}) ⊆ (B ∪ {B}). 
4.6. Prove or disprove that (A ∪ {A}) ∪ (B ∪ {B}) = (A ∪ B) ∪ {A ∪ B}. 
4.7. Prove or disprove that (A ∪ {A}) ∩ (B ∪ {B}) = (A ∩ B) ∪ {A ∩ B}. 
4.8. Prove or disprove that (A ∪ {A}) \ (B ∪ {B}) = (A \ B) ∪ {A \ B}. 
Prove or disprove each of the following statements for all K, L, M ∈ ℕ. 
4.9. Prove that if K ∈ M and L ∈ M, then K ∪ L ∈ M. 
4.10. Prove that if K ∈ M and L ∈ M, then K ∩ L ∈ M. 
4.11. Outline a formal proof of theorem 4.4. 
4.12. Outline a formal proof of theorem 4.8. 
4.13. Outline a formal proof of theorem 4.9. 
4.14. Outline a formal proof of theorem 4.10. 
4.15. Outline a formal proof of theorem 4.15. 
4.16. Prove that ⋃ℕ = ℕ. 
4.17. Prove that ⋂ℕ = ∅. 


4.18. For the successor function, prove that N = G^{∘N}(∅) for each N ∈ ℕ*. 


4.19. Prove that if a subset V ⊆ ℕ contains 2, and if V also contains the 
“successor” X ∪ {X} of each of its elements X, then V contains all the natural numbers 
except 0 and 1: V = ℕ \ {∅, {∅}}. In other words, replace ∅ by {∅, {∅}} in P 
to obtain the logical formula R defined by 


(R) ⇔ [({∅, {∅}} ∈ A) ∧ ∀X {(X ∈ A) ⇒ [(X ∪ {X}) ∈ A]}]. 


Prove that ⊢ ∀V [{(V ⊆ ℕ) ∧ [Subf^A_V(R)]} ⇒ (V = ℕ \ {∅, {∅}})]. 


4.20. Prove that if a subset U ⊆ ℕ contains 1, and if U also contains the 
“successor” X ∪ {X} of each of its elements X, then U contains all the positive 
natural numbers: U = ℕ \ {∅}. In other words, replace ∅ by {∅} in P to obtain the 
logical formula Q defined by 


(Q) ⇔ [({∅} ∈ A) ∧ ∀X {(X ∈ A) ⇒ [(X ∪ {X}) ∈ A]}]. 


Prove that ⊢ ∀U [{(U ⊆ ℕ) ∧ [Subf^A_U(Q)]} ⇒ (U = ℕ \ {∅})]. 


4.3 Arithmetic with Natural Numbers 


4.3.1 Addition with Natural Numbers 


This subsection defines the addition of natural numbers and establishes some of 
its properties, all by induction. Besides explaining the foundations of integer arith- 
metic, the following considerations also provide examples of proofs by induction. 


4.22 Definition (Addition). For every M ∈ ℕ, define 


M + 0 := M,            (A0) 
M + 1 := M ∪ {M}.      (A1) 


Then for every M ∈ ℕ, define an addition function by induction, so that for every 
N ∈ ℕ, 


M + (N + 1) := (M + N) + 1.      (A2) 


4.23 Remark. According to definition 4.22, adding N amounts to adding 1 repeatedly 
N times. With an alternative but logically equivalent notation, for each M ∈ ℕ 
definition 4.22 uses theorem 4.21 and the successor function 


G : ℕ → ℕ, 
N ↦ N + 1, 


to specify by induction (recursively) an addition function F^(M) : ℕ → ℕ such that 


F^(M)(0) := M,                      (A0) 
G(L) := L + 1,                      (A1) 
F^(M)(N + 1) := G[F^(M)(N)].        (A2) 
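
The recursive specification of addition can be transcribed into a short program that uses only the successor. In the following Python sketch, Python integers stand in for natural numbers, the built-in expression `n + 1` plays the role of the successor N ∪ {N}, and the names `successor` and `add` are hypothetical.

```python
def successor(n):
    """Stand-in for the successor N ∪ {N}, written N + 1 in the text."""
    return n + 1

def add(m, n):
    """Compute m + n by applying the successor n times, starting from m."""
    result = m                       # (A0): M + 0 := M
    for _ in range(n):
        result = successor(result)   # (A2): M + (N + 1) := (M + N) + 1
    return result

print(add(3, 4))  # 7
print(all(add(p, q) == add(q, p) for p in range(6) for q in range(6)))  # True
```

The last line checks commutativity only for a few small values; such a finite check illustrates the statement without replacing a proof by induction.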
The following theorem shows that addition is associative. 
4.24 Theorem. For all P, Q, R ∈ ℕ, (P + Q) + R = P + (Q + R). 
Proof. For each P and each Q, proceed by induction with R. For R := 0, 


(P + Q) + 0 = P + Q          by (A0) with M := (P + Q), 
            = P + (Q + 0)    by (A0) with M := Q. 


Second, assume that there exists some K ∈ ℕ such that the theorem holds for R := 
K, so that (P + Q) + K = P + (Q + K) for each P and each Q; then 


(P + Q) + (K + 1) = [(P + Q) + K] + 1    (A2), M := (P + Q), N := K, 
                  = [P + (Q + K)] + 1    induction hypothesis, 
                  = P + [(Q + K) + 1]    (A2), M := P, N := (Q + K), 
                  = P + [Q + (K + 1)]    (A2), M := Q, N := K. 
□ 


The following three theorems show that addition commutes. The first theorem 
shows that adding 0 commutes. 


4.25 Theorem. For each N ∈ ℕ, 0 + N = N. 
Proof. Proceed by induction with N. First, establish the conclusion for N := 0: 


0 + 0 = 0    by (A0) with M := 0. 


Second, assume that 0 + K = K for some K ∈ ℕ, so that the theorem holds for 
N := K; then 


0 + (K + 1) = (0 + K) + 1    by (A2), 
            = K + 1          induction hypothesis. 


The second theorem shows that adding 1 commutes. 


4.26 Theorem. For each P ∈ ℕ, 1 + P = P + 1. 
Proof. Proceed by induction with P. First, for P := 0, 


1 + 0 = 1        by (A0) with M := 1, 
      = 0 + 1    by theorem 4.25. 


Second, assume that 1 + K = K + 1 for some K ∈ ℕ, so that the theorem holds for 
P := K; then 


1 + (K + 1) = (1 + K) + 1    (A2) with M := 1 and N := K, 
            = (K + 1) + 1    induction hypothesis. 


Finally, the third theorem shows that addition commutes. 
4.27 Theorem. For all natural numbers P and Q, P + Q = Q + P. 
Proof. For each P ∈ ℕ, proceed by induction with Q. First, 


P + 0 = P        by (A0), 
      = 0 + P    by theorem 4.25. 


Second, assume that there exists K ∈ ℕ such that the theorem holds for Q := K, so 
that R + K = K + R for each R ∈ ℕ; then for each P ∈ ℕ, 


P + (K + 1) = P + (1 + K)    induction hypothesis with R := 1, 
            = (P + 1) + K    theorem 4.24, 
            = K + (P + 1)    induction hypothesis with R := (P + 1), 
            = K + (1 + P)    theorem 4.26, 
            = (K + 1) + P    theorem 4.24. 


4.3.2 Multiplication with Natural Numbers 


This subsection defines by induction the multiplication of natural numbers and 
establishes some of its properties, as well as some relations between addition and 
multiplication. Besides a continuation of the foundations of integer arithmetic, the 
following considerations also provide further examples of proofs by induction. 


4.28 Definition (Multiplication). For every nonnegative integer M ∈ ℕ, define a 
multiplication function by induction based on the following specifications: 


M * 0 := 0,                      (M0) 
M * (N + 1) := (M * N) + M.      (M1) 


4.29 Remark. By definition 4.28, multiplying M and N amounts to starting from 
0 and adding M repeatedly N times. With an alternative but logically equivalent 
notation, for each M ∈ ℕ, definition 4.28 uses a recursive definition with G (in 
theorem 4.21) replaced by the addition function F^(M) from remark 4.23, 


G := F^(M) : ℕ → ℕ, 
N ↦ N + M, 


to specify by induction a multiplication function H^(M) : ℕ → ℕ such that 


G(L) := F^(M)(L) = L + M, 
H^(M)(0) := 0,                      (M0) 
H^(M)(N + 1) := G[H^(M)(N)].        (M1) 


4.30 Example. Multiplication by repeated addition, as in definition 4.28, was used 
in the IBM® 604® [64, p.23 & 71] and IBM 650® [65, p.35 & 36, §2]. 
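
Multiplication by repeated addition, as specified by (M0) and (M1), can be sketched in a few lines. In the following Python illustration, Python integers stand in for natural numbers, and the names `add` and `multiply` are hypothetical; the sketch does not describe how any particular machine implemented its arithmetic.

```python
def add(m, n):
    """Compute m + n by applying the successor n times, starting from m."""
    result = m
    for _ in range(n):
        result = result + 1      # successor step
    return result

def multiply(m, n):
    """Compute m * n by adding m repeatedly n times, starting from 0."""
    result = 0                       # (M0): M * 0 := 0
    for _ in range(n):
        result = add(result, m)      # (M1): M * (N + 1) := (M * N) + M
    return result

print(multiply(3, 4))  # 12
print(all(multiply(add(p, q), r) == add(multiply(p, r), multiply(q, r))
          for p in range(5) for q in range(5) for r in range(5)))  # True
```

The last line checks distributivity on the right-hand side only for a few small values, as an illustration rather than a proof.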


The following theorem shows that multiplication distributes over addition on the 
right-hand side. 


4.31 Theorem. For all P, Q, R ∈ ℕ, (P + Q) * R = (P * R) + (Q * R). 
Proof. For each P and each Q, proceed by induction with R. For R := 0, 


(P + Q) * 0 = 0                      (M0) with M := (P + Q), 
            = 0 + 0                  (A0) with M := 0, 
            = (P * 0) + (Q * 0)      (M0) with M := P and (M0) with M := Q. 


Second, assume that there exists K ∈ ℕ such that the theorem holds for R := K, so 
that (P + Q) * K = (P * K) + (Q * K) for each P and each Q; then 


(P + Q) * (K + 1) = [(P + Q) * K] + (P + Q)            (M1), M := P + Q, 
                  = [(P * K) + (Q * K)] + (P + Q)      induction hypothesis, 
                  = (P * K) + [(Q * K) + (P + Q)]      theorem 4.24, 
                  = (P * K) + [{(Q * K) + P} + Q]      theorem 4.24, 
                  = (P * K) + [{P + (Q * K)} + Q]      theorem 4.27, 
                  = (P * K) + [P + {(Q * K) + Q}]      theorem 4.24, 
                  = [(P * K) + P] + [(Q * K) + Q]      theorem 4.24, 
                  = [P * (K + 1)] + [Q * (K + 1)]      (M1) twice. 
□ 


The following three theorems show that multiplication commutes. The first 
theorem shows that multiplication by 0 commutes. 


4.32 Theorem. For each natural number N ∈ ℕ, 0 * N = 0. 


Proof. Proceed by induction with N. First, establish the conclusion for N := 0: 


0 * 0 = 0    (M0). 


Second, assume that there exists K ∈ ℕ such that the theorem holds for N := K, so 
that 0 * K = 0; then 


0 * (K + 1) = (0 * K) + 0    (M1), 
            = 0 + 0          induction hypothesis, 
            = 0              (A0). 


The second theorem shows that multiplication by 1 commutes. 
4.33 Theorem. For each nonnegative integer N ∈ ℕ, 1 * N = N. 


Proof. Proceed by induction with N. First, establish the conclusion for N := 0: 


1 * 0 = 0    (M0). 


Second, assume that there exists K ∈ ℕ such that the theorem holds for N := K, so 
that 1 * K = K; then 


1 * (K + 1) = (1 * K) + 1    (M1), 
            = K + 1          induction hypothesis. 


Finally, the third theorem shows that multiplication commutes. 
4.34 Theorem. For all nonnegative integers P, Q ∈ ℕ, P * Q = Q * P. 
Proof. For each P ∈ ℕ, proceed by induction with Q. First, for Q := 0, 


P * 0 = 0        (M0), 
      = 0 * P    theorem 4.32. 


Second, assume that there exists K ∈ ℕ such that the theorem holds for Q := K, so 
that P * K = K * P for each P ∈ ℕ; then 


P * (K + 1) = (P * K) + P          (M1), 
            = (K * P) + P          induction hypothesis, 
            = (K * P) + (1 * P)    theorem 4.33, 
            = (K + 1) * P          theorem 4.31. 
□ 


The next theorem shows that multiplication distributes over addition also on the 
left-hand side. 


4.35 Theorem. For all P, Q, R ∈ ℕ, P * (Q + R) = (P * Q) + (P * R). 
Proof. Use commutativity and distributivity on the right-hand side 
(theorems 4.34, 4.31): 


P * (Q + R) = (Q + R) * P          theorem 4.34, 
            = (Q * P) + (R * P)    theorem 4.31, 
            = (P * Q) + (P * R)    theorem 4.34. 


The following theorem shows that multiplication is associative. 
4.36 Theorem. For all P, Q, R ∈ ℕ, P * (Q * R) = (P * Q) * R. 
Proof. For all P, Q ∈ ℕ, proceed by induction with R. For R := 0, 


P * (Q * 0) = P * 0          (M0), 
            = 0              (M0), 
            = (P * Q) * 0    (M0) with M := (P * Q). 


Second, assume that there exists K ∈ ℕ such that the theorem holds for R := K, so 
that P * (Q * K) = (P * Q) * K for all P, Q ∈ ℕ; then 


(P * Q) * (K + 1) = [(P * Q) * K] + (P * Q)    (M1) with M := (P * Q), 
                  = [P * (Q * K)] + (P * Q)    induction hypothesis, 
                  = P * [(Q * K) + Q]          theorem 4.35, 
                  = P * [Q * (K + 1)]          (M1). 
□ 


There are other arithmetic operations with natural numbers, such as the factorial. 


4.37 Definition. For each N ∈ ℕ, define N! (read “N factorial”) recursively by 


0! := 1, 
(N + 1)! := (N!) * (N + 1). 
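
The two defining equations of the factorial translate into a short loop. The following Python sketch is only an illustration, with Python integers as stand-ins for natural numbers and with the hypothetical name `factorial`.

```python
def factorial(n):
    """Compute n! from 0! := 1 and (K + 1)! := (K!) * (K + 1)."""
    result = 1                       # 0! := 1
    for k in range(n):
        result = result * (k + 1)    # (K + 1)! := (K!) * (K + 1)
    return result

print(factorial(5))  # 120
```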


4.3.3. Exercises on Arithmetic by Induction 


The following exercises involve the following sets (also defined in example 3.35): 


0 := ∅, 
1 := {0}, 
2 := {0, 1}, 
3 := {0, 1, 2}, 
4 := {0, 1, 2, 3}, 
5 := {0, 1, 2, 3, 4}, 
6 := {0, 1, 2, 3, 4, 5}, 
7 := {0, 1, 2, 3, 4, 5, 6}, 
8 := {0, 1, 2, 3, 4, 5, 6, 7}, 
9 := {0, 1, 2, 3, 4, 5, 6, 7, 8}. 

4.21. Prove that 1 + 1 = 2. 
4.22. Prove that 2 + 1 = 3. 
4.23. Prove that 3 + 1 = 4. 
4.24. Prove that 4 + 1 = 5. 
4.25. Prove that 5 + 1 = 6. 
4.26. Prove that 6 + 1 = 7. 
4.27. Prove that 7 + 1 = 8. 
4.28. Prove that 8 + 1 = 9. 
4.29. Prove that 2 + 2 = 4. 
4.30. Prove that 3 + 2 = 5. 
4.31. Prove that 4 + 2 = 6. 
4.32. Prove that 5 + 2 = 7. 
4.33. Prove that 6 + 2 = 8. 
4.34. Prove that 7 + 2 = 9. 
4.35. Prove that 3 + 3 = 6. 
4.36. Prove that 4 + 3 = 7. 
4.37. Prove that 5 + 3 = 8. 
4.38. Prove that 6 + 3 = 9. 
4.39. Prove that 4 + 4 = 8. 
4.40. Prove that 5 + 4 = 9. 
4.41. Prove that 2 * 2 = 4. 
4.42. Prove that 3 * 2 = 6. 
4.43. Prove that 4 * 2 = 8. 
4.44. Prove that 3 * 3 = 9. 
4.45. Prove or disprove that addition distributes over multiplication on the left. 
4.46. Prove or disprove that addition distributes over multiplication on the right. 
The following exercises refer to the factorial specified by definition 4.37. 


4.47. Identify a set C and functions F and G that fit theorem 4.21 to justify 
definition 4.37. 


4.48. Calculate 0!, 1!, 2!, 3!. 
4.49. Prove or disprove that (P + Q)! = (P!) + (Q!) for all P, Q ∈ ℕ. 
4.50. Prove or disprove that (P * Q)! = (P!) * (Q!) for all P, Q ∈ ℕ. 


4.4 Orders and Cancellations 


4.4.1 Orders on the Natural Numbers 


The set of natural numbers can model geometric concepts, for instance, a direction 
from left to right, and algebraic concepts, for instance, increasing magnitudes: 


0<1<2<3<4<5<6<7<8<9<... 


For both types of concepts, geometric and algebraic, it suffices to introduce an 
ordering relation on the natural numbers. To this end, this subsection shows that the 
natural numbers are well-ordered by a relation ≤ defined by 

(M ≤ N) ⇔ [(M = N) ∨ (M ∈ N)]. 

From this well-ordering relation will result the laws of arithmetic cancellations, 
which will also allow for the solutions of certain equations. The first results define 
a strict order < in terms of the foundational relation ∈ of set membership. 

4.38 Definition. For all M, N ∈ ℕ, define M < N (read “M is less than N”) by 
(M < N) ⇔ (M ∈ N). 


4.39 Example. 0 < 1 because 0 ∈ 1: 0 = ∅ and 1 = {∅}, so ∅ ∈ {∅}. 
1 < 2 because 1 ∈ 2: 1 = {∅} and 2 = {∅, {∅}}, whence {∅} ∈ {∅, {∅}}. 
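
Small instances of definition 4.38 can be checked mechanically. The following Python sketch is an illustration only: it assumes that finite sets of sets are modelled by Python's `frozenset`, and the helper names `successor`, `von_neumann`, and `less_than` are hypothetical, not notation from the text.

```python
def successor(n: frozenset) -> frozenset:
    """Return n + 1 := n ∪ {n}."""
    return n | frozenset([n])


def von_neumann(k: int) -> frozenset:
    """Build the set that represents the natural number k, starting from 0 := ∅."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n


def less_than(m: frozenset, n: frozenset) -> bool:
    """Definition 4.38: M < N if and only if M ∈ N."""
    return m in n


zero, one, two = von_neumann(0), von_neumann(1), von_neumann(2)
print(less_than(zero, one))  # 0 < 1 because ∅ ∈ {∅}
print(less_than(one, two))   # 1 < 2 because {∅} ∈ {∅, {∅}}
print(less_than(two, two))   # 2 is not an element of 2
```

Such checks cover only the finitely many sets that are actually built; the general statements still require the proofs by induction.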


The following theorem establishes the transitivity of the relation ∈ on ℕ. 
4.40 Theorem. For all L, M, N ∈ ℕ, if L ∈ M and M ∈ N, then L ∈ N: 
∀L ∀M ∀N {[(L ∈ ℕ) ∧ (M ∈ ℕ) ∧ (N ∈ ℕ)] 
⇒ {[(L ∈ M) ∧ (M ∈ N)] ⇒ (L ∈ N)}}. 

Proof. Proceed by induction with N. 

Initial step 

If N := 0 = ∅, then for all nonnegative integers L, M ∈ ℕ the proposition M ∈ ∅ 
is False, whence so is the conjunction (L ∈ M) ∧ (M ∈ 0), and hence the implication 
[(L ∈ M) ∧ (M ∈ 0)] ⇒ (L ∈ 0) is True. 

Induction hypothesis 

Assume that there exists some K ∈ ℕ such that the theorem holds for N := K, so 
that [(L ∈ M) ∧ (M ∈ K)] ⇒ (L ∈ K) for all L, M ∈ ℕ. 

Induction step 

If L ∈ M and M ∈ K + 1 = K ∪ {K}, then two cases arise: M ∈ {K} or M ∈ K. 

In the first case, M ∈ {K}, whence M = K, and the hypothesis L ∈ M then yields 
L ∈ M = K ⊆ K ∪ {K}, so that L ∈ K + 1. 

In the second case, M ∈ K, whence the induction hypothesis yields L ∈ K ⊆ 
K ∪ {K} = K + 1, and then again L ∈ K + 1. □ 


The transitivity of the relation ∈ holds on the set ℕ, but it can fail on other sets. 

4.41 Counterexample. If X := ∅, Y := {∅}, and Z := {{∅}}, then X ∈ Y and 
Y ∈ Z, but X ∉ Z. Theorem 4.40 fails for Z = {{∅}} because {{∅}} ∉ ℕ. 

The following theorem shows that on the natural numbers the relation ∈ is neither 
reflexive nor symmetric, but, instead, ∈ is both irreflexive and asymmetric. 

4.42 Theorem. For all I, L, M ∈ ℕ, M ∉ M, and (I ∉ L) ∨ (L ∉ I). 
Proof. This proof of the first result (M ∉ M) proceeds by induction with M. 

Initial step 

If M = 0, then M = ∅. From ∅ ∉ ∅ it follows that M ∉ M. 

Induction hypothesis 

Assume that there exists some K ∈ ℕ such that the theorem holds for M := K, so 
that K ∉ K. 

Induction step 

This step of the proof proceeds by contraposition. If (K ∪ {K}) ∈ (K ∪ {K}), then 
two cases can arise: (K ∪ {K}) ∈ K or (K ∪ {K}) ∈ {K}. 

First, if (K ∪ {K}) ∈ K, then the transitivity of the relation ∈ (theorem 4.40) and 
K ∈ (K ∪ {K}) give K ∈ K, which contradicts the induction hypothesis. 

Second, if (K ∪ {K}) ∈ {K}, then (K ∪ {K}) = K, whence K ∈ {K} ⊆ (K ∪ 
{K}) = K gives K ∈ K, which contradicts the induction hypothesis. 

The proof of the second result proceeds by contraposition. For all I ∈ ℕ and 
L ∈ ℕ, the conjunction (I ∈ L) ∧ (L ∈ I) and the transitivity of ∈ (theorem 4.40) 
give I ∈ I, which contradicts the result just proved. Therefore, ¬[(I ∈ L) ∧ (L ∈ I)]. 
□ 


The following theorem shows that each natural number differs from its successor. 
4.43 Theorem. For each natural number N ∈ ℕ, N ≠ N + 1. 

Proof. From N ∈ {N} ⊆ (N ∪ {N}) = N + 1 it follows that N ∈ (N + 1). Yet N ∉ N 
by theorem 4.42. Therefore N ≠ N + 1 by the axiom of extensionality (S1). □ 


The following theorem shows that adding 1 to both sides of a valid inequality 
gives a valid inequality, so that if M < N then (M + 1) < (N + 1). 

4.44 Theorem. For all M, N ∈ ℕ, if M ∈ N, then (M + 1) ∈ (N + 1): 
⊢ ∀M ∀N {[(M ∈ ℕ) ∧ (N ∈ ℕ)] ⇒ {(M ∈ N) ⇒ [(M + 1) ∈ (N + 1)]}}. 

Proof. Apply induction with N. For N := 0 = ∅, and for each M ∈ ℕ, the 
proposition M ∈ ∅ is False; hence the implication (M ∈ ∅) ⇒ [(M + 1) ∈ (N + 1)] 
is True. 

Next, assume that there exists K ∈ ℕ such that the theorem holds for N := K, so 
that (M ∈ K) ⇒ [(M + 1) ∈ (K + 1)] holds for every M ∈ ℕ. If M ∈ K + 1 = 
K ∪ {K}, then two cases occur. 

In the first case, M ∈ {K}, and then M = K, whence M + 1 = K + 1 ∈ 
(K + 1) ∪ {K + 1} = (K + 1) + 1. 

In the second case, M ∈ K, whence M + 1 ∈ K + 1 by induction hypothesis; with 
K + 1 ⊆ (K + 1) ∪ {K + 1} = (K + 1) + 1, it follows that M + 1 ∈ (K + 1) + 1. □ 


The next theorem shows that 0 = @ is the smallest natural number. 


4.45 Theorem. For each natural number M ∈ ℕ, if M ≠ 0, then 0 ∈ M. 
Proof. This proof proceeds by induction with M. 

For M := 0, the proposition M ≠ 0 is False, whence the implication (M ≠ 0) ⇒ 
(0 ∈ M) is True. 

Next, assume that there exists K ∈ ℕ such that the theorem holds for M := K, so 
that (K ≠ 0) ⇒ (0 ∈ K). Two cases arise: K = 0 or K ≠ 0. 

In the first case, if K = 0, then 0 = ∅ ∈ (∅ ∪ {∅}) = K ∪ {K} = K + 1; thus, 
0 ∈ K + 1 is True. 

In the second case, if K ≠ 0, then 0 ∈ K, by the induction hypothesis, whence 
0 ∈ K ⊆ K ∪ {K} = K + 1, and hence 0 ∈ K + 1. □ 


The following theorem shows that the relation ∈ is connected on the natural 
numbers: if M ≠ N, then either M ∈ N or N ∈ M. 

4.46 Theorem. For all natural numbers M, N ∈ ℕ, exactly one of the following 
three formulae is True, while the other two are False: 

M ∈ N, 
M = N, 
N ∈ M. 

Proof. First, observe that at most one of the three formulae may hold. Indeed, M ∉ 
M by theorem 4.42, whence (M ∈ N) ∧ (M = N) is False, and (M = N) ∧ (N ∈ M) 
is False. Similarly, also by theorem 4.42, (M ∈ N) ∧ (N ∈ M) is also False. Second, 
at least one of the three formulae must hold. This proof uses induction with N. 

For N := 0, and for each M ∈ ℕ either M = 0 = N is True, or M ≠ 0, and then 
0 ∈ M is True by theorem 4.45. To complete the induction, assume that there exists 
K ∈ ℕ such that the theorem holds for N := K, so that (M ∈ K) ∨ (M = K) ∨ (K ∈ 
M) for each M ∈ ℕ, and examine all three formulae. 

If M ∈ K, then M ∈ K ⊆ K + 1, whence M ∈ K + 1. 

If M = K, then M = K ∈ {K} ⊆ K ∪ {K}, whence M ∈ K + 1. 

If K ∈ M, then K + 1 ∈ M + 1 = M ∪ {M} (theorem 4.44); two cases occur. 

In the first case, K + 1 ∈ {M}, whence K + 1 = M. In the second case, K + 1 ∈ M 
already. Thus, (M ∈ [K + 1]) ∨ (M = [K + 1]) ∨ ([K + 1] ∈ M) is True. □ 


The foregoing result completes the proof that < is a strict total order (irreflexive, 
asymmetric, and transitive) on the natural numbers. The following theorem shows 
that the natural numbers are well-ordered by the relation <. 


4.47 Theorem (Well-Ordering Principle). Each nonempty subset of the natural 
numbers has a smallest element. 


Proof. By contraposition, this proof establishes that every subset of the natural 
numbers without a smallest element (definition 3.192) is empty. To this end, assume 
that a subset E ⊆ ℕ has no smallest element. Thus every N ∈ ℕ is not the smallest 
element of E, which means that N ∉ E or that E contains an element M such that 
M < N. 
To prove that E = ∅, the proof proceeds by induction with the set S ⊆ ℕ 
consisting of every N ∈ ℕ such that the complement ℕ \ E contains every M ≤ N: 

S := {N ∈ ℕ: (N ∪ {N}) ⊆ (ℕ \ E)}. 

Initial step 

The set S = ℕ \ E contains 0; indeed, contraposition shows that if E contained 0, 
then 0 would be the smallest element of E, because 0 ≤ N for every N ∈ ℕ by 
theorem 4.45 and hence also for every N ∈ E. 

Induction step 

Assume that K ∈ S for some K ∈ ℕ; thus, every M ≤ K belongs to the complement 
of E. Contraposition again confirms that if E contained K + 1, then K + 1 would be 
the smallest element of E; consequently, K + 1 ∉ E, whence K + 1 ∈ S. Therefore, 
S = ℕ, whence E = ℕ \ S = ∅. □ 


4.48 Definition (minimum). For each nonempty subset E ⊆ ℕ, the minimum of 
E is the smallest element of E and it is denoted by min(E). 


In contrast to the Well-Ordering Principle, some nonempty subsets of the natural 
numbers have no maximum. Yet every nonempty subset of the natural numbers that 
has an upper bound (definition 3.198) also has a largest element (definition 3.192). 


4.49 Theorem. Each nonempty subset of the natural numbers with an upper bound 
in the natural numbers has a largest element. 


Proof. If there exists an upper bound M ∈ ℕ for a nonempty subset E ⊆ ℕ, then 
I ≤ M for every I ∈ E. Let S ⊆ ℕ be the set of all the upper bounds for E: 

S := {N ∈ ℕ: ∀I [(I ∈ E) ⇒ (I ≤ N)]}. 

Then M ∈ S by hypothesis on E; in particular, S ≠ ∅. Consequently, by the Well- 
Ordering Principle (theorem 4.47) S has a smallest element K := min(S). Because 
K ∈ S, it follows that I ≤ K for every I ∈ E. 

Moreover, K ∈ E, as proved by the following induction. If K = 0, then E = {0} 
because K = 0 < I for every I ∈ ℕ \ {0}; hence K = 0 ∈ E. 

Suppose that there exists L ∈ ℕ such that the theorem holds for K := L, so that 
for each E ⊆ ℕ if L = min(S), then L ∈ E. If min(S) = L + 1, then L + 1 ∈ E, 
for otherwise I ≤ L for every I ∈ E and then L ∈ S; thus the theorem holds for 
L + 1. □ 
4.50 Definition (maximum). For each nonempty subset E ⊆ ℕ with an upper 
bound in ℕ, the maximum of E is the largest element of E and it is denoted by 
max(E). 

The following abbreviations occasionally prove convenient. 

4.51 Definition. For all I ∈ ℕ and N ∈ ℕ define a set {I, ..., N} by 
{I, ..., N} := {K ∈ ℕ: (I ≤ K) ∧ (K ≤ N)}, 
which thus consists of all the natural numbers from I through N. Similarly, 
{I, ...} := {I, I + 1, I + 2, ...} := {K ∈ ℕ: I ≤ K}, 
which thus consists of all the natural numbers larger than or equal to I. 

4.52 Example. If E := {2, ..., 7}, then E = {2, 3, 4, 5, 6, 7}. Also, min(E) = 2 
and max(E) = 7. 

4.53 Example. If E := {2, 3, ...}, then min(E) = 2 but E has no maximum. 


4.4.2 Laws of Arithmetic Cancellations 


This subsection establishes rules to cancel terms in equations with additions or 
multiplications. The material also provides more practice with proofs by mathematical 
induction. The first rule shows how to solve equations of the form M + 1 = L + 1. 

4.54 Theorem. For all L, M ∈ ℕ, if M + 1 = L + 1, then M = L. 

Proof. If M + 1 = L + 1, then M ∪ {M} = L ∪ {L}, and the sets on both sides have 
the same elements. In particular, L ∈ (L ∪ {L}) and thus L ∈ (M ∪ {M}), whence 
two cases arise: L ∈ {M}, or L ∈ M. 

In the first case, L ∈ {M}, and then L = M indeed. 

In the second case, L ∈ M; then M ∉ {L}, for otherwise M = L and L ∈ L would 
contradict theorem 4.42. However, M ∈ L ∪ {L}, whence M ∈ L, but that would also 
contradict theorem 4.42. Therefore, this second case does not occur. □ 


The following theorem allows for the cancellation of an additive term N common 
to both sides of an equation of the type M+ N=L+N. 


4.55 Theorem. For all M, N, L ∈ ℕ, if M + N = L + N, then M = L. 

Proof. For all natural numbers L, M ∈ ℕ, proceed by induction with N. For N := 0, 
if M + 0 = L + 0, then M = M + 0 = L + 0 = L, by hypothesis and by (A0). 

Second, assume that there exists K ∈ ℕ such that the theorem holds for N := K, 
so that for all natural numbers L, M ∈ ℕ, if M + K = L + K then M = L. 
If M + (K + 1) = L + (K + 1), then the associativity and the commutativity of 
addition lead to (M + 1) + K = (L + 1) + K, whence M + 1 = L + 1 by induction 
hypothesis, whence finally M = L by theorem 4.54. □ 


The following theorem shows that addition preserves the ordering. 
4.56 Theorem. For all L, M, N ∈ ℕ, if L < M then L + N < M + N. 

Proof. For all L and M, proceed by induction with N. 
For N := 0, and for all L and M, if L < M, then L + 0 = L < M = M + 0 by (A0). 
To complete the induction, assume that there exists K ∈ ℕ such that the theorem 
holds for N := K, so that for all L and M, if L < M then L + K < M + K. Hence, 
if L < M then L + 1 < M + 1 by theorem 4.44, whence 

L + (K + 1) = L + (1 + K) 
= (L + 1) + K 
< (M + 1) + K 
= M + (1 + K) 
= M + (K + 1). 

4.57 Theorem. For all M, N ∈ ℕ, if M ≠ 0, then N < N + M. 

Proof. Set L := 0 in theorem 4.56. If M ≠ 0 then 0 < M by theorem 4.45, whence 
L < M. Hence N = 0 + N < M + N by theorem 4.56. □ 


The following theorem forms the basis for the concept of subtraction of a natural 
number from a larger natural number. 


4.58 Theorem. For all M, N ∈ ℕ, if M < N, then there exists exactly one natural 
number L ∈ ℕ such that M + L = N. 

Proof. This proof establishes the existence and the uniqueness separately. 

Existence 

First, establish the existence of such a number L. For each M ∈ ℕ, proceed by 
induction with N. 
If N := 0, then (M < 0) is False, because (M < 0) ⇔ (M ∈ ∅). Therefore 

⊢ ∀M ([(M ∈ ℕ) ∧ (M < 0)] ⇒ {∃L [(L ∈ ℕ) ∧ (M + L = N)]}) 

is universally valid, and hence the theorem holds for N = 0. 

For each M ∈ ℕ, assume that there exists K ∈ ℕ such that the theorem holds 
for N := K, so that for each M ∈ ℕ, if M < K then there exists L ∈ ℕ such that 
M + L = K. If M < (K + 1) = K ∪ {K}, then two cases arise. 
In the first case, M ∈ {K}, whence M = K, and then M + 1 = K + 1, so that 
L = 1. In the second case, M ∈ K, that is, M < K, and the induction hypothesis 
yields some L ∈ ℕ for which M + L = K; hence M + (L + 1) = (M + L) + 1 = K + 1. 

Uniqueness 

Second, verify the uniqueness of L, which results from the theorem that if M + L = 
N = M + K, then L = K (theorem 4.55). □ 


The following definition specifies the concept of subtraction of a natural number 
from a larger natural number. 


4.59 Definition (Subtraction). For all L, M, N ∈ ℕ, if M < N, then N − M := L 
is the natural number L defined in theorem 4.58 such that M + L = N. 

4.60 Example. 8 − 5 = 3 because 5 + 3 = 8 by exercise 4.37. 
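
The existence part of the proof of theorem 4.58 is constructive: it produces L = 1 when M = K, and it adds 1 to the number obtained for K otherwise. The Python sketch below follows that recursion. It is an illustration only, under the assumption that Python integers stand in for natural numbers, with n - 1 as the predecessor of a nonzero number; the name `subtract` is hypothetical.

```python
def subtract(n: int, m: int) -> int:
    """Return the unique L with m + L == n, assuming m < n (definition 4.59)."""
    if not 0 <= m < n:
        raise ValueError("N - M is defined here only for natural numbers with M < N")
    k = n - 1  # n = k + 1, so that m < k + 1 means m == k or m < k
    if m == k:
        return 1               # first case: M = K, whence M + 1 = K + 1
    return subtract(k, m) + 1  # second case: M + L = K gives M + (L + 1) = K + 1


print(subtract(8, 5))  # prints 3, as in example 4.60
```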


The following theorem shows that multiplication by a nonzero natural number 
preserves the ordering. 


4.61 Theorem. For all L, M, N ∈ ℕ, if M < N and 0 < L, then L * M < L * N. 

Proof. For all nonnegative integers M, N ∈ ℕ, use induction with L, the smallest 
nonzero value of L being 1. For L := 1, if M < N, then 1 * M = M < N = 1 * N. 

Assume that there exists K ∈ ℕ with 0 < K such that the theorem holds for 
L := K, so that for all natural numbers M, N ∈ ℕ, if M < N, then K * M < K * N. 
Thus, if M < N, then apply theorem 4.56 twice: 

(K + 1) * M = (K * M) + M < (K * M) + N < (K * N) + N = (K + 1) * N. 
□ 


The following theorem allows for the cancellation of a nonzero multiplicative 
term on both sides of an equation, which forms the basis for the division of a natural 
number by a nonzero natural number, and the solution of equations of the type 
L * M = L * N. 

4.62 Theorem. For all L, M, N ∈ ℕ, if 0 < L and L * M = L * N, then M = N. 

Proof. Proceed by contraposition. 
If M ≠ N, then either M < N or N < M. 
If M < N, then L * M < L * N and L * M ≠ L * N. 
Similarly, if N < M, then L * N < L * M and L * M ≠ L * N. □ 

4.63 Definition (division). For all K, L, N ∈ ℕ, if 0 < L and K * L = N, then 
define N/L by N/L := K, as defined uniquely by theorem 4.62. 
4.64 Example. 8/4 = 2 because 2 * 4 = 8 by exercise 4.43. 


4.65 Example. Division by repeated subtraction was used in the Whirlwind I 
computer [82, p. 3-7-3-9]. 
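
Division in the sense of definition 4.63 can be computed by repeated subtraction, the general idea mentioned in example 4.65. The Python sketch below illustrates that idea only; it is not a reconstruction of the Whirlwind I routine, the function name `divide_exact` is hypothetical, and Python integers stand in for natural numbers.

```python
def divide_exact(n: int, l: int) -> int:
    """Return the natural number K with K * l == n, found by repeated subtraction."""
    if n < 0 or l <= 0:
        raise ValueError("expected natural numbers with 0 < L")
    k = 0
    remainder = n
    while remainder >= l:
        remainder -= l  # subtract the divisor once more
        k += 1          # and count the subtraction
    if remainder != 0:
        raise ValueError("no natural number K satisfies K * L = N")
    return k


print(divide_exact(8, 4))  # prints 2, as in example 4.64
```

The sketch raises an error when no quotient exists, which mirrors the fact that definition 4.63 specifies N/L only if some K satisfies K * L = N.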


4.4.3 Exercises on Orders and Cancellations 


The following exercises involve the natural numbers (sets) 0, 1,2,3, 4,5, 6,7, 8,9 
defined in example 3.35, page 128, and reviewed in subsection 4.3.3. 


4.51. Prove that 2 < 5. 
4.52. Prove that 5 < 7. 
4.53. Prove that 2 < 7. 
4.54. Prove that 3 < 5. 
4.55. Prove that 3 < 7. 
4.56. Prove that 4 < 9. 
4.57. Prove that 2 < 8. 
4.58. Prove: there are no natural numbers K > 1, L > 1 with K * L = 2. 
4.59. Prove: there are no natural numbers K > 1, L > 1 with K * L = 3. 
4.60. Prove: there are no natural numbers K > 1, L > 1 with K * L = 5. 
4.61. Prove: there are no natural numbers K > 1, L > 1 with K * L = 7. 

4.62. Determine all the natural numbers K > 1 and L > 1 such that K * L = 6 and 
prove that there are no other such natural numbers. 

4.63. Determine all the natural numbers K > 1 and L > 1 such that K * L = 9 and 
prove that there are no other such natural numbers. 

4.64. Prove that the relation ≤ is a total ordering on ℕ, with 
(M ≤ N) ⇔ [(M = N) ∨ (M ∈ N)]. 

4.65. Prove: for each nonempty S ⊆ ℕ there exists I ∈ S with I ∩ S = ∅. 
4.66. Prove that for each L ∈ ℕ* there exists I ∈ L such that I ∩ L = ∅. 
4.67. Prove that (I ∉ K) ∨ (K ∉ L) ∨ (L ∉ I) for all I, K, L ∈ ℕ. 

4.68. Prove that 1 < N for each N ∈ (ℕ \ {0, 1}). 

4.69. Prove that ℕ ∉ ℕ. 

4.70. Prove that ℕ ≠ (ℕ ∪ {ℕ}). 
4.5.1 Negative Integers 


Several operations remain undefined with natural numbers. For instance, the 
ordering relation does not include elements smaller than the empty set, which 
precludes the use of natural numbers to model relations extending in two opposite 
directions. Also, the arithmetic of natural numbers does not contain provisions 
for the “difference” from a larger natural number to a smaller one. Finally, the 
arithmetic of natural numbers does not include any concept of division other than 
special cases. 

There exist several methods to extend the ordering and the arithmetic of natural 
numbers, to allow for elements “smaller” than zero, and for differences of any two 
elements. Such methods begin with the specification of a larger set of “integers” ℤ 
[from the German “Zahl(en)” for “number(s)”]. 

One method of defining a larger set, outlined by Kunen [74, p. 35], is sufficiently 
general to produce not only the integers, but also the rational numbers and the real 
numbers. Essentially, the method consists in defining the new numbers in terms of 
equivalence classes. For the integers, the strategy consists in introducing the concept 
of the “difference” between two natural numbers M and N by means of the pair 
(M,N). Then relate (M,N) to every other pair (P,Q) with the same “difference” 
between P and Q. Because the concept of “difference” has not yet been defined for 
all pairs of natural numbers, however, a precise definition uses sums instead. 

Two cases arise: either N < M, or N > M. In the first case, if N < M, then 
definition 4.59 specifies their difference J := M − N. If also (M, N) and (P, Q) 
represent the same difference, then P − Q = J = M − N. An equivalent statement 
without subtractions results from adding J to both extremes: 

M − N = J, M = N + J; 
P − Q = J, P = Q + J. 

In the second case, if N > M with J := N − M and also Q − P = J = N − M, then 

N − M = J, N = M + J; 
Q − P = J, Q = P + J. 

In either case, the statement that (M, N) and (P, Q) represent the same “difference” 
can be reworded without subtractions but with sums instead: there exists J ∈ ℕ 
with 

• either M = N + J and P = Q + J (if M > N and P > Q), 
• or N = M + J and Q = P + J (if M < N and P < Q). 

Swapping coordinates, from (M,N) to (N,M), will amount geometrically to 
reversing an order or a direction, which will amount algebraically to passing from a 
positive to a negative integer. Such considerations lead to the following definition. 


4.66 Definition. Define a relation ≡ on the set A := ℕ × ℕ for all M, N, P, Q ∈ ℕ by 

(M, N) ≡ (P, Q) 
⇕ 
∃J {(J ∈ ℕ) ∧ {[(M = N + J) ∧ (P = Q + J)] ∨ [(N = M + J) ∧ (Q = P + J)]}}: 
there is a J ∈ ℕ with M = N + J and P = Q + J, or N = M + J and Q = P + J. 

4.67 Example. (3, 6) ≡ (5, 8) because J := 3 confirms that 6 = 3 + 3 and 8 = 
5 + 3, by exercises 4.35 and 4.37. 


The following theorem provides an equivalent formulation of this relation. 
4.68 Theorem. For all pairs of natural numbers, 
(M, N) ≡ (P, Q) 
⇕ 
M + Q = N + P. 
Proof. This proof establishes the two implications (⇒ and ⇐) separately. 
Assume first that (M, N) ≡ (P, Q). Then two cases can arise. 
If there exists I ∈ ℕ with M = N + I and P = Q + I, then 
M + Q = (N + I) + Q = N + (I + Q) = N + (Q + I) = N + P. 
If there exists I ∈ ℕ with N = M + I and Q = P + I, then 
M + Q = M + (P + I) = M + (I + P) = (M + I) + P = N + P. 

For the converse, assume that M + Q = N + P. Then two cases can arise. 
If M ≤ N, then there exists I ∈ ℕ such that N = M + I. Hence 
M + Q = N + P = (M + I) + P = M + (P + I) 

whence (by theorem 4.55) cancelling M yields Q = P + I. 
If M > N, then there exists I ∈ ℕ* such that M = N + I. Hence 

N + P = M + Q = (N + I) + Q = N + (Q + I) 

whence (by theorem 4.55) cancelling N yields P = Q + I. □ 
4.69 Example. (5, 4) ≡ (2, 1) because 5 + 1 = 4 + 2, by exercises 4.25 and 4.31. 
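
The two formulations of the relation, definition 4.66 and theorem 4.68, can be compared on small pairs. The Python sketch below is an illustration under stated assumptions: pairs of natural numbers are modelled by tuples of Python integers, the function names are hypothetical, and the search for J in definition 4.66 is bounded by the largest coordinate, which suffices because any witness J equals the difference of two coordinates.

```python
from itertools import product


def equivalent(a, b) -> bool:
    """Criterion of theorem 4.68: (M, N) relates to (P, Q) iff M + Q = N + P."""
    (m, n), (p, q) = a, b
    return m + q == n + p


def equivalent_by_definition(a, b) -> bool:
    """Definition 4.66: some J satisfies M = N + J and P = Q + J, or N = M + J and Q = P + J."""
    (m, n), (p, q) = a, b
    bound = max(m, n, p, q)
    return any(
        (m == n + j and p == q + j) or (n == m + j and q == p + j)
        for j in range(bound + 1)
    )


print(equivalent((3, 6), (5, 8)))  # example 4.67
print(equivalent((5, 4), (2, 1)))  # example 4.69

# Compare both formulations on every pair with coordinates below 6.
pairs = list(product(range(6), repeat=2))
print(all(equivalent(a, b) == equivalent_by_definition(a, b) for a in pairs for b in pairs))
```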


Instead of definition 4.66 the derivations can hence utilize theorem 4.68. 
Thus the following theorem shows that the relation ≡ is an equivalence relation 
(definition 3.150). 

4.70 Theorem. The relation ≡ is an equivalence relation on ℕ × ℕ. 

Proof. This proof verifies that the relation ≡ is reflexive, symmetric, and transitive. 

Reflexivity 

For each pair (M, N) ∈ (ℕ × ℕ), the equality M + N = N + M holds by the 
commutativity of addition, whence (M, N) ≡ (M, N). 

Symmetry 

If (M, N) ≡ (P, Q), then M + Q = N + P, whence Q + M = P + N, which means 
that (P, Q) ≡ (M, N). 

Transitivity 

If (K, L) ≡ (M, N), then K + N = L + M. If also (M, N) ≡ (P, Q), then M + Q = 
N + P. Consequently, 

K + (M + Q) = K + (N + P) = (K + N) + P = (L + M) + P = L + (M + P) 

whence (theorem 4.55) cancelling M yields K + Q = L + P, and (K, L) ≡ (P, Q). 
□ 

4.71 Definition (Kunen’s definition of the integers). The set of integers is the set 
ℤ := (ℕ × ℕ)/≡ of all equivalence classes [(M, N)]≡ for the relation ≡ [74, p. 35]. 

4.72 Example. Setting I := 3 shows that (0, 3) ≡ (1, 4) ≡ (2, 5) ≡ (3, 6) ≡ 
(4, 7). Thus the pairs (0, 3), (1, 4), (2, 5), (3, 6), (4, 7) are elements of the 
equivalence class [(0, 3)]≡. Similarly, (3, 0) ≡ (4, 1) ≡ (5, 2) ≡ (6, 3) ≡ (7, 4). 
Thus the pairs (3, 0), (4, 1), (5, 2), (6, 3), (7, 4) are elements of the equivalence class 
[(3, 0)]≡. 


Each pair (I, J) ∈ ℕ × ℕ is equivalent to a pair of the type (K, 0) or (0, K). 

4.73 Theorem. If I > J, then (I, J) ≡ (I − J, 0). 
If I < J, then (I, J) ≡ (0, J − I). 

Proof. If I > J, then I = (I − J) + J and J = 0 + J whence (I, J) ≡ (I − J, 0). 
If I < J, then I = 0 + I and J = (J − I) + I whence (I, J) ≡ (0, J − I). □ 
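
Theorem 4.73 amounts to a procedure that reduces any pair to a representative of the type (K, 0) or (0, K). The Python sketch below illustrates it on a handful of pairs. It is an assumption-laden illustration: tuples of Python integers stand in for pairs of natural numbers, Python's built-in subtraction stands in for definition 4.59, the name `canonical` is hypothetical, and the case I = J, which the theorem does not state, is sent to (0, 0).

```python
def canonical(pair):
    """Reduce (I, J) to (I - J, 0) if I > J, or to (0, J - I) if I < J."""
    i, j = pair
    if i >= j:
        return (i - j, 0)  # also covers I = J, which gives (0, 0)
    return (0, j - i)


# Group a few sample pairs by their reduced representative.
pairs = [(1, 3), (3, 3), (6, 3), (0, 2), (2, 2), (5, 2), (1, 1), (4, 1), (0, 0), (3, 0)]
classes = {}
for pair in pairs:
    classes.setdefault(canonical(pair), []).append(pair)

for representative, members in sorted(classes.items()):
    print(representative, members)
```

Two pairs land in the same group exactly when they reduce to the same representative, which gives a convenient test for membership in the same equivalence class.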


The following diagram shows a few elements from three equivalence classes: 
[(0, 2)]≡, [(0, 0)]≡, and [(3, 0)]≡ relative to the relation ≡ on A := ℕ × ℕ: 

        (1, 3)          (3, 3)                  (6, 3) 
(0, 2)          (2, 2)                  (5, 2) 
        (1, 1)                  (4, 1) 
(0, 0)                  (3, 0) 

4.5.2 Arithmetic with Integers 


From the preceding definition of integers in terms of equivalent pairs of natural 
numbers — which represent equivalent differences — follows a definition of 
arithmetic in terms of such pairs. 


4.74 Definition (Kunen’s definition of integer arithmetic). Define an arithmetic 
with pairs of natural numbers as follows: 


(M, N) + (P, Q) := (M + P, N + Q), 
(M, N) * (P, Q) := ([M * P] + [N * Q], [M * Q] + [N * P]). 
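
The arithmetic of definition 4.74 is easy to experiment with. In the Python sketch below, which is an illustration only, tuples of Python integers stand in for pairs of natural numbers, and the function names are hypothetical. The helper `as_int` interprets a pair (M, N) as the ordinary integer M - N; that interpretation is an assumption used for checking, not part of the construction.

```python
from itertools import product


def add_pairs(a, b):
    """(M, N) + (P, Q) := (M + P, N + Q)."""
    (m, n), (p, q) = a, b
    return (m + p, n + q)


def mul_pairs(a, b):
    """(M, N) * (P, Q) := (M*P + N*Q, M*Q + N*P)."""
    (m, n), (p, q) = a, b
    return (m * p + n * q, m * q + n * p)


def as_int(pair):
    """Interpret (M, N) as the ordinary integer M - N (for checking only)."""
    m, n = pair
    return m - n


print(add_pairs((0, 1), (4, 0)))  # prints (4, 1)
print(mul_pairs((2, 0), (0, 3)))  # prints (0, 6)

# Both operations agree with ordinary integer arithmetic on small pairs.
pairs = list(product(range(5), repeat=2))
print(all(
    as_int(add_pairs(a, b)) == as_int(a) + as_int(b)
    and as_int(mul_pairs(a, b)) == as_int(a) * as_int(b)
    for a in pairs for b in pairs
))
```

The multiplication rule reflects the expansion (M - N)(P - Q) = (MP + NQ) - (MQ + NP), rewritten so that no subtraction occurs.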


The following theorem verifies that addition and multiplication of pairs commute. 

4.75 Theorem. For all M, N, P, Q ∈ ℕ, 
(M, N) + (P, Q) = (P, Q) + (M, N), 
(M, N) * (P, Q) = (P, Q) * (M, N). 

Proof. Apply the commutativity of addition and multiplication with natural 
numbers: 
(M, N) + (P, Q) = (M + P, N + Q) 
= (P + M, Q + N) 
= (P, Q) + (M, N); 

(M, N) * (P, Q) = ([M * P] + [N * Q], [M * Q] + [N * P]) 
= ([P * M] + [Q * N], [Q * M] + [P * N]) 
= ([P * M] + [Q * N], [P * N] + [Q * M]) 
= (P, Q) * (M, N). 
□ 


The next theorem checks that arithmetic with equivalent pairs yields equivalent 
results: different pairs representing the same difference yield the same sum or 
product. 


4.76 Theorem. For all I, J, K, L, M, N, P, Q ∈ ℕ, if 
(I, J) ≡ (K, L), 
(M, N) ≡ (P, Q), 
then 
(I, J) + (M, N) ≡ (K, L) + (P, Q), 
(I, J) * (M, N) ≡ (K, L) * (P, Q). 

Proof. If (I, J) ≡ (K, L), then I + L = J + K. If also (M, N) ≡ (P, Q), then 
M + Q = N + P. Moreover, 

(I, J) + (M, N) = (I + M, J + N), 
(K, L) + (P, Q) = (K + P, L + Q), 

whence 

(I + M) + (L + Q) = (I + L) + (M + Q) 
= (J + K) + (N + P) 
= (J + N) + (K + P), 

which means that (I, J) + (M, N) ≡ (K, L) + (P, Q). For the multiplication, 
(I, J) * (M, N) = ([I * M] + [J * N], [I * N] + [J * M]), 

(K, L) * (P, Q) = ([K * P] + [L * Q], [K * Q] + [L * P]), 
so that (I, J) * (M, N) ≡ (K, L) * (P, Q) if and only if 
([I * M] + [J * N]) + ([K * Q] + [L * P]) = ([I * N] + [J * M]) + ([K * P] + [L * Q]). 
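Additions and cancellations (theorem 4.55) give 

I * M + J * N + K * Q + L * P = I * N + J * M + K * P + L * Q 
⇕ (adding L * M + K * N + K * M + L * N to both sides) 
(I + L) * M + (J + K) * N + K * (M + Q) + L * (N + P) 
= (I + L) * N + (J + K) * M + K * (N + P) + L * (M + Q) 
⇕ (substituting I + L = J + K and M + Q = N + P) 
(J + K) * M + (J + K) * N + K * (N + P) + L * (N + P) 
= (J + K) * N + (J + K) * M + K * (N + P) + L * (N + P) 
⇕ (cancelling the terms common to both sides) 
0 = 0, 

which is universally valid, whence (I, J) * (M, N) ≡ (K, L) * (P, Q). □ 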


Hence the following definition specifies an arithmetic with equivalence classes 
based on the arithmetic of representative pairs. 


4.77 Definition (Integer addition and multiplication). 
([(M, N)]≡) + ([(P, Q)]≡) := [(M + P, N + Q)]≡, 

([(M, N)]≡) * ([(P, Q)]≡) := [(M * P + N * Q, M * Q + N * P)]≡. 

4.78 Theorem. For ℤ, and for all K, L, M, N, P, Q ∈ ℕ, 
addition commutes, 
addition is associative, 
[(0, 0)]≡ is the additive unit: [(M, N)]≡ + [(0, 0)]≡ = [(M, N)]≡, 
[(N, M)]≡ is the additive inverse: [(M, N)]≡ + [(N, M)]≡ = [(0, 0)]≡, 
multiplication commutes, 
multiplication is associative, 
multiplication distributes over addition, and 
[(1, 0)]≡ is a multiplicative unit: [(M, N)]≡ * [(1, 0)]≡ = [(M, N)]≡. 

Proof. The proof forms the object of exercises. □ 
The following definition specifies a subtraction. 


4.79 Definition (Integer subtraction). Define a subtraction of integers by 

([(M, N)]≡) − ([(P, Q)]≡) := ([(M, N)]≡) + ([(Q, P)]≡), 

which is a binary operation − from ℤ × ℤ to ℤ. Define an “opposite” by 

−([(P, Q)]≡) := ([(Q, P)]≡), 

which is a unary operation (unfortunately also denoted by −) from ℤ to ℤ. 


4.80 Example. Reducing every pair (I, J) to either form (I − J, 0) or (0, J − I) can 
facilitate calculations: 

[(2, 3)]≡ + [(6, 2)]≡ = [(0, 1)]≡ + [(4, 0)]≡ 
= [(4, 1)]≡ 
= [(3, 0)]≡; 

[(2, 1)]≡ − [(7, 3)]≡ = [(2, 1)]≡ + [(3, 7)]≡ 
= [(1, 0)]≡ + [(0, 4)]≡ 
= [(1, 4)]≡ 
= [(0, 3)]≡ 
= −[(3, 0)]≡; 

[(3, 1)]≡ * [(2, 5)]≡ = [(2, 0)]≡ * [(0, 3)]≡ 
= [(2 * 0 + 0 * 3, 2 * 3 + 0 * 0)]≡ 
= [(0, 6)]≡ 
= −[(6, 0)]≡. 


4.5.3 Order on the Integers 


The order < on ℕ extends to ℤ by a definition of “positive” and “negative” on ℤ. 
By theorem 4.73, every integer [(M, N)]≡ is the equivalence class of a pair (I, 0) 
or (0, I) for some I ∈ ℕ. In the first case M − N = I − 0 ≥ 0, so that M − N 
is a nonnegative difference. In the second case N − M = I − 0 ≥ 0, so that the 
opposite, N − M, is a nonnegative difference, and then M − N represents a 
non-positive difference. 

4.81 Definition. An integer [(M, N)]≡ is positive if and only if M > N; an integer 
[(M, N)]≡ is negative if and only if M < N. 


4.82 Definition. The sets of nonzero integers (ℤ*), negative integers (ℤ*₋), positive 
integers (ℤ*₊), non-positive integers (ℤ₋), and nonnegative integers (ℤ₊) are 

ℤ* := ℤ \ {[(0, 0)]≡}, 

ℤ₋ := {[(M, N)]≡ ∈ ℤ: M ≤ N} 
= {[(M, N)]≡ ∈ ℤ: ∃I {(I ∈ ℕ) ∧ [(M, N) ≡ (0, I)]}}, 
ℤ₊ := {[(M, N)]≡ ∈ ℤ: M ≥ N} 
= {[(M, N)]≡ ∈ ℤ: ∃I {(I ∈ ℕ) ∧ [(M, N) ≡ (I, 0)]}}, 
ℤ*₋ := {[(M, N)]≡ ∈ ℤ: M < N} 
= {[(M, N)]≡ ∈ ℤ: ∃I {(I ∈ ℕ*) ∧ [(M, N) ≡ (0, I)]}}, 
ℤ*₊ := {[(M, N)]≡ ∈ ℤ: M > N} 
= {[(M, N)]≡ ∈ ℤ: ∃I {(I ∈ ℕ*) ∧ [(M, N) ≡ (I, 0)]}}. 


The following relation on pairs will lead to an order on equivalence classes. 


4.83 Definition. Define a relation < on ℕ × ℕ as follows: 

(M, N) < (0, 0) 
⇕ 
M < N 

and 

(M, N) < (P, Q) 
⇕ 
(M, N) + (Q, P) < (0, 0) 
⇕ 
M + Q < N + P. 


The following theorem verifies that the ordering does not depend on the choice 
of the pairs representing the equivalence classes. 
4.84 Theorem. For all I, J, K, L, M, N, P, Q ∈ ℕ, if 
(I, J) ≡ (K, L), 
(M, N) ≡ (P, Q), 

then 

(I, J) < (M, N) 
⇕ 
(K, L) < (P, Q). 

Proof. If (I, J) ≡ (K, L), then I + L = J + K. If also (M, N) ≡ (P, Q), then also 
M + Q = N + P. Consequently, adding L + P to each side of the inequalities yields 

(I, J) < (M, N) 
⇕ 
I + N < J + M 
⇕ 
(I + L) + (N + P) < (L + P) + (J + M) 
⇕ 
(J + K) + (M + Q) < (L + P) + (J + M) 
⇕ 
(J + M) + (K + Q) < (L + P) + (J + M) 
⇕ 
K + Q < L + P 
⇕ 
(K, L) < (P, Q). 


A definition of an order on Z can thus use any pair from an equivalence class. 


4.85 Definition. Define an order < on ℤ by 

[(M, N)]≡ < [(P, Q)]≡ 
⇕ 
(M, N) < (P, Q) 
⇕ 
M + Q < N + P. 
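
The criterion M + Q < N + P compares two integers without any subtraction, and theorem 4.84 guarantees that the answer does not depend on the chosen representatives. The Python sketch below illustrates both points on small pairs. It is an illustration only: tuples of Python integers stand in for pairs of natural numbers, and the names `less_pairs` and `shift` are hypothetical.

```python
from itertools import product


def less_pairs(a, b) -> bool:
    """(M, N) < (P, Q) if and only if M + Q < N + P."""
    (m, n), (p, q) = a, b
    return m + q < n + p


def shift(pair, j):
    """Return another representative of the same class: (M + J, N + J)."""
    m, n = pair
    return (m + j, n + j)


print(less_pairs((0, 3), (3, 0)))  # the class of (0, 3) lies below the class of (3, 0)
print(less_pairs((2, 2), (5, 5)))  # both pairs represent the same class as (0, 0)

# Changing representatives does not change the outcome of the comparison.
pairs = list(product(range(5), repeat=2))
print(all(
    less_pairs(a, b) == less_pairs(shift(a, 2), shift(b, 3))
    for a in pairs for b in pairs
))
```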


Theorem 4.86 shows that < is a strict total order (definition 3.187). 
4.86 Theorem. The relation < is a strict total order on ℤ. In particular, if 
[(K, L)]≡ < [(M, N)]≡ 

and 
[(M, N)]≡ < [(P, Q)]≡, 

then 
[(K, L)]≡ < [(P, Q)]≡. 

Proof. This proof verifies that < is irreflexive, connected, and transitive. 

Irreflexivity 
(The relation does not relate any element to itself.) For each pair (M, N) ∈ ℕ × ℕ, 
M + N ≮ N + M, whence ([(M, N)]≡) ≮ ([(M, N)]≡). 
Connectedness 
(Each element is related to every different element.) If ([(M, N)]≡) ≠ 
([(P, Q)]≡), then M + Q ≠ N + P whence either M + Q < N + P, and then 
([(M, N)]≡) < ([(P, Q)]≡), or M + Q > N + P, and then ([(M, N)]≡) > ([(P, Q)]≡). 
Transitivity 
If ([(K, L)]≡) < ([(M, N)]≡), then 

K + N < L + M; 
if also ([(M, N)]≡) < ([(P, Q)]≡), then also 

M + Q < N + P. 
Consequently, adding the preceding two inequalities gives 

(K + N) + (M + Q) < (L + M) + (N + P) 

whence the commutativity and associativity of addition yield 

K + [(M + N) + Q] < L + [(M + N) + P] 
whence (by theorem 4.56) cancelling M + N yields 
K + Q < L + P 

so that [(K, L)]≡ < [(P, Q)]≡. □ 
The next theorem shows that multiplication by a positive integer keeps the order. 


4.87 Theorem. If 
[(M, N)]≡ < [(P, Q)]≡, 
[(0, 0)]≡ < [(I, J)]≡ < [(K, L)]≡, 
then 
([(I, J)]≡ * [(M, N)]≡) < ([(I, J)]≡ * [(P, Q)]≡), 
whence, if also (P, Q) > (0, 0), then 
([(I, J)]≡ * [(M, N)]≡) < ([(K, L)]≡ * [(P, Q)]≡). 
Proof. The proof uses the equivalence (I, J) ≡ (I − J, 0). First, from [(M, N)]≡ < 
[(P, Q)]≡ it follows that M + Q < N + P by definition of < on the pairs. Hence a 
multiplication throughout by I − J > 0 yields 
(I − J) * (M + Q) < (I − J) * (N + P) 
whence 
(I, J) * (M, N) ≡ (I − J, 0) * (M, N) 
= ([I − J] * M, [I − J] * N) 
< ([I − J] * P, [I − J] * Q) 
≡ (I, J) * (P, Q). 

Swapping the roles of (I, J) with (K, L), and (M, N) with (P, Q), then yields 
(I, J) * (P, Q) = (P, Q) * (I, J) < (P, Q) * (K, L) = (K, L) * (P, Q). 
Consequently, 

(I, J) * (M, N) < (I, J) * (P, Q) < (K, L) * (P, Q). 
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The following theorem shows that the square of every integer is nonnegative. 
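4.88 Theorem. For every $X \in \mathbb{Z}$, $X * X \geq 0$.

Proof. For every $X \in \mathbb{Z}$ there exists $K \in \mathbb{N}$ such that $X = [(K,0)]_{\equiv}$ or $X = [(0,K)]_{\equiv}$, by theorem 4.73. However,
\[
\begin{aligned}
(K,0) * (K,0) &= (K * K + 0 * 0, K * 0 + 0 * K) \\
&= (K * K, 0) \\
&= (0 * 0 + K * K, 0 * K + K * 0) \\
&= (0,K) * (0,K),
\end{aligned}
\]
and in either case $X * X = [(K * K, 0)]_{\equiv} \geq [(0,0)]_{\equiv}$ because $K * K + 0 \geq 0 + 0$. □

4.89 Example. $(-[(1,0)]_{\equiv}) * (-[(1,0)]_{\equiv}) = [(0,1)]_{\equiv} * [(0,1)]_{\equiv} = [(1,0)]_{\equiv}$.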
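The multiplication of pairs used in the proof of theorem 4.88 can be tried out directly. The sketch below is a hedged illustration: the names `multiply` and `is_nonnegative` are hypothetical, and the product formula $(K,L) * (M,N) = (K * M + L * N, K * N + L * M)$ is read off from the computation of $(K,0) * (K,0)$ above rather than quoted from its original definition.

```python
def multiply(a, b):
    """Assumed product of pairs: (K, L) * (M, N) := (K*M + L*N, K*N + L*M)."""
    (k, l), (m, n) = a, b
    return (k * m + l * n, k * n + l * m)

def is_nonnegative(a):
    """[(P, Q)] >= [(0, 0)] when P + 0 >= Q + 0."""
    p, q = a
    return p >= q

# Example 4.89: (-1) * (-1) = 1, with -1 represented by the pair (0, 1).
minus_one = (0, 1)
square_of_minus_one = multiply(minus_one, minus_one)

# Theorem 4.88 on a small, arbitrary range of representatives.
squares_nonnegative = all(is_nonnegative(multiply((p, q), (p, q)))
                          for p in range(6) for q in range(6))

print(square_of_minus_one, squares_nonnegative)
```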


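An alternative method to define all the integers specifies from $\mathbb{N}$ a similar but disjoint set $\mathbb{Z}^*_-$ for all the "negative" integers, and then defines $\mathbb{Z} := \mathbb{Z}^*_- \cup \mathbb{N}$. For instance, apply the axioms of the power set, pairing, and separation to define
\[
\begin{aligned}
\mathbb{Z}^*_- &:= \{K \in \mathcal{P}(\mathbb{N}) : \exists N[(N \in \mathbb{N}) \wedge (N \neq 0) \wedge (K = \{N\})]\} \\
&= \{\{N\} : N \in \mathbb{N}^*\} \\
&= \{\ldots, \{3\}, \{2\}, \{1\}\}.
\end{aligned}
\]
Then change the notation to define $-N := \{N\}$ for each $N \in \mathbb{N}^*$. In particular, $\mathbb{Z}^*_- \cap \mathbb{N} = \varnothing$. Indeed, $\varnothing \in N$ by theorem 4.45 and $\varnothing \notin \{N\}$ for each positive natural number $N \in \mathbb{N} \setminus \{\varnothing\}$. Consequently, $\{N\} \notin \mathbb{N}$ for every $\{N\} \in \mathbb{Z}^*_-$.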


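4.90 Definition (Landau's definition of integer arithmetic). The set $\mathbb{Z} := \mathbb{Z}^*_- \cup \mathbb{N}$ is the set of integers. Arithmetic extends from $\mathbb{N}$ to $\mathbb{Z}$ by setting
\[
\begin{aligned}
(-M) + (-N) &:= -(M + N), \\
(-N) + M := M + (-N) &:= -(N - M) && \text{if } M < N, \\
(-N) + M := M + (-N) &:= M - N && \text{if } N \leq M.
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
(-N) * (-M) &:= M * N, \\
(-N) * M := M * (-N) &:= -(M * N).
\end{aligned}
\]
Moreover, the ordering $\in$ on $\mathbb{N}$ extends to an ordering on $\mathbb{Z}$ by the definition
\[
\begin{aligned}
&(-N) < (-M) \text{ if and only if } M < N, \\
&(-N) < K
\end{aligned}
\]
for all $K \in \mathbb{N}$ and $M, N \in \mathbb{N}^*$.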
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The addition and multiplication thus defined for integers remain associative and 
commutative, multiplication distributes over addition, and N + 0 = N and 1 * N = N for
each integer N € Z. Proofs proceed by cases, depending on the sign of the operands, 
and are straightforward but lengthy. See [76, Ch. IV] or the exercises. 


4.91 Remark. Common usage abbreviates each nonnegative integer $[(I,0)]_{\equiv}$ by $I$.
Also, any variable can denote an integer, for instance, $M = [(P,Q)]_{\equiv}$.


4.5.4 Nonnegative Integral Powers of Integers 


The $I$-th power $M^I$ of an integer $M \in \mathbb{Z}$ is the product $M * \cdots * M$ of $I$ factors $M$.
Specifically, from the convention $M^0 := 1$, induction produces higher powers.


4.92 Definition (Integral powers). For each integer $M$, define
\[ M^0 := 1. \]
Then for each integer $M$ and for each nonnegative integer $I$ define
\[ M^{I+1} := (M^I) * M. \]
In $M^I$, the number $M$ is the base while $I$ is the exponent.


4.93 Remark. According to definition 4.92, the $I$-th power of $M$ amounts to $I$ multiplications by $M$, beginning with 1. With an alternative but logically equivalent notation, for each $M \in \mathbb{N}$ definition 4.92 uses a recursive definition with $G$ in theorem 4.21 replaced by the multiplication function $H^M$ from remark 4.29:
\[ G := H^M : \mathbb{N} \to \mathbb{N}, \qquad L \mapsto M * L, \]
to specify by induction an exponentiation function $E^M : \mathbb{N} \to \mathbb{Z}$, $I \mapsto M^I$, with
\[
\begin{aligned}
G(L) &:= H^M(L) = M * L, \\
E^M(0) &:= 1, \\
E^M(I + 1) &:= G[E^M(I)].
\end{aligned}
\]


4.94 Example. Here are the first four nonnegative powers of the integer 2:
\[
\begin{aligned}
2^0 &= 1; \\
2^1 &= (2^0) * 2 = 1 * 2 = 2; \\
2^2 &= (2^1) * 2 = 2 * 2 = 4; \\
2^3 &= (2^2) * 2 = 4 * 2 = 8.
\end{aligned}
\]
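The recursion in definition 4.92 translates almost literally into code. The following Python sketch is an illustration under the assumption that Python's built-in integers stand in for the integers constructed in this chapter; the function name `power` is a hypothetical choice.

```python
def power(m, i):
    """Return m to the i-th power by the recursion
    M^0 := 1 and M^(I+1) := (M^I) * M of definition 4.92."""
    if i < 0:
        raise ValueError("the exponent must be a nonnegative integer")
    if i == 0:
        return 1
    return power(m, i - 1) * m

# The first four nonnegative powers of 2, as in example 4.94.
first_powers_of_two = [power(2, i) for i in range(4)]
print(first_powers_of_two)
```

The recursion performs exactly one multiplication per unit of the exponent, which mirrors remark 4.93.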
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The following theorem establishes relations between product of bases, sums of 
exponents, and integral powers. 


4.95 Theorem. For all $M \in \mathbb{Z}$ and $N \in \mathbb{Z}$, and for all $I \in \mathbb{N}$ and $J \in \mathbb{N}$,
\[
\begin{aligned}
(M * N)^I &= (M^I) * (N^I), \\
N^{I+J} &= (N^I) * (N^J).
\end{aligned}
\]

Proof. This proof proceeds by induction on the exponent. The first equation, $(M * N)^I = (M^I) * (N^I)$, holds for all integers $M$ and $N$, and for $I \in \{0, 1\}$:
\[
\begin{aligned}
(M * N)^0 &= 1 = 1 * 1 = (M^0) * (N^0); \\
(M * N)^1 &= M * N = (M^1) * (N^1).
\end{aligned}
\]
Hence, to prove that $(M * N)^I = (M^I) * (N^I)$ for every $I \in \mathbb{N}$, let
\[ S := \{ I \in \mathbb{N} : \forall M \forall N \{ [(M \in \mathbb{Z}) \wedge (N \in \mathbb{Z})] \Rightarrow [(M * N)^I = (M^I) * (N^I)] \} \}. \]
Thus, assume that there exists $K \in S$, or, equivalently, $K \in \mathbb{N}$ such that the theorem holds for $I := K$; then
\[
\begin{aligned}
(M * N)^{K+1} &= [(M * N)^K] * (M * N) && \text{definition 4.92,} \\
&= [(M^K) * (N^K)] * (M * N) && \text{induction hypothesis,} \\
&= [(M^K) * M] * [(N^K) * N] && \text{associativity and commutativity,} \\
&= (M^{K+1}) * (N^{K+1}) && \text{definition 4.92 twice,}
\end{aligned}
\]
whence $K + 1 \in S$, and hence $S = \mathbb{N}$, which means that the first equation holds.

The second equation, $N^{I+J} = (N^I) * (N^J)$, holds for each integer $N$, for each nonnegative integer $I$, and for $J \in \{0, 1\}$:
\[
\begin{aligned}
N^{I+0} &= N^I = (N^I) * 1 = (N^I) * (N^0), \\
N^{I+1} &= (N^I) * N = (N^I) * (N^1).
\end{aligned}
\]
Next, let
\[ S := \{ J \in \mathbb{N} : \forall I \forall N [N^{I+J} = (N^I) * (N^J)] \} \]
and assume that there exists $K \in S$, or, equivalently, $K \in \mathbb{N}$ such that the theorem holds for $J := K$, so that $N^{I+K} = (N^I) * (N^K)$ for all $I \in \mathbb{N}$ and $N \in \mathbb{Z}$; then
\[
\begin{aligned}
N^{I+(K+1)} &= N^{(I+1)+K} && \text{associativity and commutativity of } +, \\
&= (N^{I+1}) * (N^K) && \text{induction hypothesis,} \\
&= [(N^I) * N] * (N^K) && \text{definition 4.92,} \\
&= (N^I) * [(N^K) * N] && \text{associativity and commutativity of } *, \\
&= (N^I) * (N^{K+1}) && \text{definition 4.92,}
\end{aligned}
\]
whence $K + 1 \in S$, and, consequently, $S = \mathbb{N}$, which proves the second equation. □


4.5.5 Exercises on Integers with Induction 


4.71. Calculate $[(2,4)]_{\equiv} + [(6,3)]_{\equiv}$.
4.72. Calculate $[(5,3)]_{\equiv} + [(1,6)]_{\equiv}$.
4.73. Calculate $[(3,1)]_{\equiv} - [(5,2)]_{\equiv}$.
4.74. Calculate $[(7,3)]_{\equiv} * [(2,5)]_{\equiv}$.
4.75. Prove that Kunen's addition commutes.
4.76. Prove that Landau's addition commutes.
4.77. Prove that Kunen's addition is associative.
4.78. Prove that Landau's addition is associative.
4.79. Prove that Kunen's multiplication commutes.
4.80. Prove that Landau's multiplication commutes.
4.81. Prove that Kunen's multiplication is associative.
4.82. Prove that Landau's multiplication is associative.
4.83. Prove that Kunen's multiplication distributes over addition.
4.84. Prove that Landau's multiplication distributes over addition.
4.85. Prove that subtraction does not commute.
4.86. Prove that subtraction is not associative.
4.87. Prove that multiplication distributes over subtraction.
4.88. Prove that subtraction does not distribute over multiplication.
4.89. Prove that subtraction does not distribute over addition.
4.90. Prove that addition does not distribute over subtraction.
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4.6 Rational Numbers 


4.6.1 Definition of Rational Numbers 


Some practical situations involve comparisons of proportions. Integer arithmetic
does not allow for proportions, but a method similar to that for passing from natural
numbers to differences also leads from integers to proportions. As a pair of natural
numbers (I, J) can represent a difference, a pair of integers (P, Q) can represent a
proportion.

For example, the density of the universe at one location involves the mass or
number of particles P in a volume Q, which can be summarized by the ordered pair
(P, Q). The density at another location involves the mass or number of particles M
in a volume N, as summarized by the pair (M, N). A comparison of the densities
(P, Q) and (M, N) can proceed through a multiplication by a common factor, for
instance, Q or N.

Specifically, in a volume N times larger than Q, the first density of P particles in
a volume Q becomes P * N particles in a volume Q * N, or (P * N, Q * N).

Likewise in a volume Q times larger than N, the second density of M particles in
a volume N becomes Q * M particles in a volume Q * N, or (Q * M, Q * N).

Both densities (P * N, Q * N) and (Q * M, Q * N) refer to the same volume Q * N;
thus, they are identical if and only if their masses equal each other: P * N = Q * M.

The foregoing reasoning leads to a relation between pairs of integers. 


4.96 Definition. On the set $\mathbb{Z} \times \mathbb{Z}^*$ of all pairs of integers $(P,Q)$ with $Q \neq 0$, define a relation $\equiv$ by
\[ (P,Q) \equiv (M,N) \iff P * N = Q * M. \]

The relation $\equiv$ on $\mathbb{Z} \times \mathbb{Z}^*$ in definition 4.96 differs from the relation $\equiv$ on $\mathbb{N} \times \mathbb{N}$
in definition 4.66. Nevertheless, it is also an equivalence relation (definition 3.150).


4.97 Theorem. The relation = in definition 4.96 is an equivalence relation on 
Zx Z*. 


Proof. This proof checks algebraically that = is reflexive, symmetric, and transitive. 


Reflexivity 


For each pair, $(I,J) \equiv (I,J)$ because $I * J = J * I$.
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Symmetry 


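If $(I,J) \equiv (K,L)$, then $I * L = J * K$ by definition of $\equiv$, whence $K * J = L * I$ by
commutativity, so that $(K,L) \equiv (I,J)$.


Transitivity
If
\[ (I,J) \equiv (K,L), \qquad (K,L) \equiv (M,N), \]
then
\[ I * L = J * K, \qquad K * N = L * M, \]
whence multiplying the left-hand sides and the right-hand sides together gives
\[
\begin{aligned}
(I * L) * (K * N) &= (J * K) * (L * M), \\
(I * N) * (L * K) &= (J * M) * (K * L).
\end{aligned}
\]
If $K \neq 0$, then cancelling the nonzero factor $L * K = K * L$ yields
\[ I * N = J * M. \]
If $K = 0$, then $I * L = 0$ and $L * M = 0$ with $L \neq 0$ force $I = 0$ and $M = 0$, so that again $I * N = 0 = J * M$. In either case, $(I,J) \equiv (M,N)$. □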


4.98 Definition. The set of rational numbers, denoted by $\mathbb{Q}$ (for "quotients"), is the
set of all equivalence classes for the relation $\equiv$. A rational number is an element
of $\mathbb{Q}$.

The equivalence class of a pair $(I,J)$ is called the ratio of $I$ to $J$, and it is denoted
by $[(I,J)]_{\equiv}$, or also by $I/J$, or also by $\frac{I}{J}$. A fraction is a pair $(I,J) \in (\mathbb{Z} \times \mathbb{Z}^*)$, where
$I$ is the numerator and $J$ is the denominator.


The following theorem provides a means to select a specific fraction from a
rational number, or, equivalently, a specific pair $(M,N)$ from an equivalence class
$P/Q$.

4.99 Theorem. For each $P/Q \in \mathbb{Q}$, there exists $M/N = P/Q$ such that
\[ M = \min\{ I \in \mathbb{N} : \exists J [(J \in \mathbb{Z}^*) \wedge (I/J = P/Q)] \}. \]

Proof. This proof applies the Well-Ordering Principle to the set of all nonnegative
numerators of a rational number. To this end, for each $P/Q \in \mathbb{Q}$ define the set
\[ E := \{ I \in \mathbb{N} : \exists J [(J \in \mathbb{Z}^*) \wedge (I/J = P/Q)] \}. \]
Then $E \neq \varnothing$, because if $P \geq 0$ then $P/Q = P/Q$, whence $P \in E$, whereas if
$P < 0$ then $(-P)/(-Q) = P/Q$, whence $(-P) \in E$. The Well-Ordering Principle
(theorem 4.47) then guarantees the existence of a first element in $E$. □


4.6.2 Arithmetic with Rational Numbers 


The comparison of two rational numbers $P/Q$ and $M/N$ can proceed through
equivalent fractions with a common denominator, for instance, $(P * N)/(Q * N)$ and
$(Q * M)/(Q * N)$. Common denominators also lead to an arithmetic with fractions
(ordered pairs) and then with rational numbers (equivalence classes).


4.100 Definition. For all pairs $(I,J)$ and $(K,L)$ in $\mathbb{Z} \times \mathbb{Z}^*$, define functions $+$ and
$*$ on $(\mathbb{Z} \times \mathbb{Z}^*) \times (\mathbb{Z} \times \mathbb{Z}^*)$ by their counterparts $+$ and $*$ already defined on $\mathbb{Z}$:
\[
\begin{aligned}
(I,J) + (K,L) &:= ([I * L] + [J * K], J * L), \\
(I,J) * (K,L) &:= (I * K, J * L).
\end{aligned}
\]

The symbols $+$ and $*$ on the left-hand sides are the functions being defined, whereas
the symbols $+$ and $*$ on the right-hand sides are the addition and multiplication of
integers. Yet common usage employs $+$ and $*$ for both.


The following theorem shows that equivalent fractions lead to equivalent results. 


4.101 Theorem. If
\[ (I,J) \equiv (K,L), \qquad (M,N) \equiv (P,Q), \]
then
\[
\begin{aligned}
(I,J) + (M,N) &\equiv (K,L) + (P,Q), \\
(I,J) * (M,N) &\equiv (K,L) * (P,Q).
\end{aligned}
\]

Proof. This proof proceeds through algebraic verifications. By hypotheses, $I * L = J * K$ and $M * Q = N * P$, whence
\[
\begin{aligned}
(I,J) + (M,N) &= (I * N + J * M, J * N), \\
(K,L) + (P,Q) &= (K * Q + L * P, L * Q),
\end{aligned}
\]
where
\[
\begin{aligned}
&(I * N + J * M) * (L * Q) - (J * N) * (K * Q + L * P) \\
&= [I * N * L * Q + J * M * L * Q] - [J * N * K * Q + J * N * L * P] \\
&= [(I * L) * (N * Q) + (J * L) * (M * Q)] - [(J * K) * (N * Q) + (J * L) * (N * P)] \\
&= [(I * L) * (N * Q) + (J * L) * (M * Q)] - [(I * L) * (N * Q) + (J * L) * (M * Q)] \\
&= 0.
\end{aligned}
\]
Thus, $(I,J) + (M,N) \equiv (K,L) + (P,Q)$. Similarly,
\[
\begin{aligned}
(I,J) * (M,N) &= (I * M, J * N), \\
(K,L) * (P,Q) &= (K * P, L * Q),
\end{aligned}
\]
where, by hypotheses,
\[
\begin{aligned}
&(I * M) * (L * Q) - (J * N) * (K * P) \\
&= (I * L) * (M * Q) - (J * K) * (N * P) \\
&= (I * L) * (M * Q) - (I * L) * (M * Q) \\
&= 0.
\end{aligned}
\]
Thus, $(I,J) * (M,N) \equiv (K,L) * (P,Q)$. □
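The statement that equivalent fractions lead to equivalent results can be illustrated on concrete pairs. The following Python sketch is only a spot check with hypothetical names (`equivalent`, `add`, `multiply`) and arbitrarily chosen sample fractions; it follows definitions 4.96 and 4.100 with Python integers in place of the integers constructed earlier.

```python
def equivalent(a, b):
    """Definition 4.96: (P, Q) ~ (M, N) when P * N == Q * M."""
    (p, q), (m, n) = a, b
    return p * n == q * m

def add(a, b):
    """Definition 4.100: (I, J) + (K, L) := (I*L + J*K, J*L)."""
    (i, j), (k, l) = a, b
    return (i * l + j * k, j * l)

def multiply(a, b):
    """Definition 4.100: (I, J) * (K, L) := (I*K, J*L)."""
    (i, j), (k, l) = a, b
    return (i * k, j * l)

a1, a2 = (1, 2), (-3, -6)   # two fractions in the class of 1/2
b1, b2 = (2, -3), (-4, 6)   # two fractions in the class of -2/3

sums_agree = equivalent(add(a1, b1), add(a2, b2))
products_agree = equivalent(multiply(a1, b1), multiply(a2, b2))
print(sums_agree, products_agree)
```

The resulting pairs differ as fractions, for instance (1, -6) and (6, -36) for the sums, yet they lie in the same equivalence class, which is the content of the theorem.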


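4.102 Definition. Define the addition and the multiplication of rational numbers by
\[
\begin{aligned}
\frac{I}{J} + \frac{K}{L} &:= \frac{I * L + J * K}{J * L}, \\
\frac{I}{J} * \frac{K}{L} &:= \frac{I * K}{J * L}.
\end{aligned}
\]

The following theorems establish algebraic characteristics of rational arithmetic.
4.103 Theorem. The addition of rational numbers commutes.
Proof. For all $I/J$ and $K/L$ in $\mathbb{Q}$,
\[ \frac{I}{J} + \frac{K}{L} = \frac{I * L + J * K}{J * L} = \frac{L * I + K * J}{L * J} = \frac{K}{L} + \frac{I}{J}. \]
□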
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4.104 Theorem. The multiplication of rational numbers commutes. 


Proof. For all $I/J$ and $K/L$ in $\mathbb{Q}$,
\[ \frac{I}{J} * \frac{K}{L} = \frac{I * K}{J * L} = \frac{K * I}{L * J} = \frac{K}{L} * \frac{I}{J}. \]
□


4.105 Theorem. The addition of rational numbers is associative. 


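Proof. For all $I/J$, $K/L$, and $M/N$ in $\mathbb{Q}$,
\[
\begin{aligned}
\left( \frac{I}{J} + \frac{K}{L} \right) + \frac{M}{N}
&= \frac{I * L + J * K}{J * L} + \frac{M}{N} \\
&= \frac{[(I * L + J * K) * N] + [(J * L) * M]}{(J * L) * N} \\
&= \frac{[I * (L * N) + J * (K * N)] + [J * (L * M)]}{J * (L * N)} \\
&= \frac{[I * (L * N)] + [J * (K * N + L * M)]}{J * (L * N)} \\
&= \frac{I}{J} + \frac{K * N + L * M}{L * N} \\
&= \frac{I}{J} + \left( \frac{K}{L} + \frac{M}{N} \right).
\end{aligned}
\]
□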


4.106 Theorem. The multiplication of rational numbers is associative. 


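Proof. For all $I/J$, $K/L$, and $M/N$ in $\mathbb{Q}$,
\[
\begin{aligned}
\left( \frac{I}{J} * \frac{K}{L} \right) * \frac{M}{N}
&= \frac{I * K}{J * L} * \frac{M}{N} \\
&= \frac{(I * K) * M}{(J * L) * N} \\
&= \frac{I * (K * M)}{J * (L * N)} \\
&= \frac{I}{J} * \frac{K * M}{L * N} \\
&= \frac{I}{J} * \left( \frac{K}{L} * \frac{M}{N} \right).
\end{aligned}
\]
□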


4.107 Theorem. For each $N \in \mathbb{N}^*$ and each $I/J \in \mathbb{Q}$, multiplying the
numerator and denominator by a nonzero common factor yields the same rational
number: $I/J = (I * N)/(J * N)$.


Proof. Verify the criterion for equivalent fractions: $I * (J * N) = J * (I * N)$. □
4.108 Theorem. The multiplication of rational numbers distributes over addition. 


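Proof. For all $I/J$, $K/L$, and $M/N$ in $\mathbb{Q}$,
\[
\begin{aligned}
\left( \frac{I}{J} + \frac{K}{L} \right) * \frac{M}{N}
&= \frac{I * L + J * K}{J * L} * \frac{M}{N} \\
&= \frac{(I * L + J * K) * M}{(J * L) * N} \\
&= \frac{(I * L) * M + (J * K) * M}{J * (L * N)} \\
&= \frac{(I * M) * L + J * (K * M)}{J * (L * N)} \\
&= \frac{(I * M) * (L * N) + (J * N) * (K * M)}{(J * N) * (L * N)} \\
&= \frac{I * M}{J * N} + \frac{K * M}{L * N} \\
&= \left( \frac{I}{J} * \frac{M}{N} \right) + \left( \frac{K}{L} * \frac{M}{N} \right).
\end{aligned}
\]
□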


The following theorem shows that adding 0/1 does not produce any change. 
4.109 Theorem. For each $K/L \in \mathbb{Q}$, $(K/L) + (0/1) = (K/L)$.
Proof. $(K/L) + (0/1) = ([K * 1 + L * 0]/[L * 1]) = (K/L)$. □


Thus, the rational number 0/1 plays the same role as the integer 0 in additions. 
The following theorem shows that each rational number has an additive inverse. 




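4.110 Theorem. Each $I/J \in \mathbb{Q}$ has an additive inverse: $(I/J) + ([-I]/J) = (0/1)$.
Proof.
\[
\begin{aligned}
\frac{I}{J} + \frac{-I}{J}
&= \frac{I * J + J * (-I)}{J * J} \\
&= \frac{J * I + J * (-I)}{J * J} \\
&= \frac{J * [I + (-I)]}{J * J} \\
&= \frac{J * 0}{J * J} \\
&= \frac{0 * (J * J)}{1 * (J * J)} \\
&= \frac{0}{1}.
\end{aligned}
\]
□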

The following theorem shows that multiplying by 1/1 changes nothing. 
4.111 Theorem. For each $K/L \in \mathbb{Q}$, $(K/L) * (1/1) = (K/L)$.
Proof. $(K/L) * (1/1) = ([K * 1]/[L * 1]) = (K/L)$. □


Thus, the rational number 1/1 plays the same role as the integer 1 in multiplications.
Similarly, each nonzero rational number has a multiplicative inverse.


4.112 Theorem. Each $I/J \in \mathbb{Q}$ such that $I \neq 0$ has a multiplicative inverse: $(I/J) * (J/I) = (1/1)$.
Proof.
\[ \frac{I}{J} * \frac{J}{I} = \frac{I * J}{J * I} = \frac{I * J}{I * J} = \frac{1 * (I * J)}{1 * (I * J)} = \frac{1}{1}. \]
□


Rational arithmetic thus satisfies the algebraic properties in table 4.115. 


4.113 Definition (Field). A field (of numbers) is a set F with at least two different
elements 0 and 1, so that 0 ≠ 1, and binary operations + and *, satisfying the
algebraic properties in table 4.115.
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Table 4.115 These properties hold for all 1/J, K/L, P/Q € Q. 


(1) | Associativity of +       | [(I/J) + (K/L)] + (P/Q) = (I/J) + [(K/L) + (P/Q)]
(2) | Commutativity of +       | (I/J) + (K/L) = (K/L) + (I/J)
(3) | Additive identity        | (K/L) + (0/1) = (K/L) = (0/1) + (K/L)
(4) | Additive inverse         | (K/L) + ([-K]/L) = (0/1)
(5) | Associativity of *       | [(I/J)(K/L)](P/Q) = (I/J)[(K/L)(P/Q)]
(6) | Commutativity of *       | (I/J)(K/L) = (K/L)(I/J)
(7) | Multiplicative identity  | (K/L)(1/1) = (K/L) = (1/1)(K/L)
(8) | Multiplicative inverse   | If (K/L) ≠ 0, then (K/L)(L/K) = (1/1)
(9) | Distributivity           | (I/J)[(K/L) + (P/Q)] = [(I/J)(K/L)] + [(I/J)(P/Q)]


Technically, the pair (+, *) suffices to identify the set F and the elements 0 and
1, but for emphasis a field can be defined as the quintuple (F, +, 0, *, 1).


4.114 Example. The quintuple (Q, +, 0, *, 1) is a field (of numbers).
The next theorem forms the basis for a concept of division of rational numbers. 


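4.116 Theorem. For each $K/L \in \mathbb{Q}$, and for each $I/J \in \mathbb{Q}$ such that $I \neq 0$,
$[(K/L) * (J/I)] * (I/J) = (K/L)$.
Proof.
\[
\left( \frac{K}{L} * \frac{J}{I} \right) * \frac{I}{J}
= \frac{K * J}{L * I} * \frac{I}{J}
= \frac{(K * J) * I}{(L * I) * J}
= \frac{K * (J * I)}{L * (I * J)}
= \frac{K * (I * J)}{L * (I * J)}
= \frac{K}{L}.
\]
□


4.117 Definition. For each $K/L \in \mathbb{Q}$, and for each $I/J \in \mathbb{Q}$ such that $I \neq 0$, define
\[
\frac{\left( \frac{K}{L} \right)}{\left( \frac{I}{J} \right)}
:= \left( \frac{K}{L} \right) \div \left( \frac{I}{J} \right)
:= \left( \frac{K}{L} \right) * \left( \frac{J}{I} \right).
\]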
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4.6.3 Notation for Sums and Products 


The notation introduced here proves convenient to define and investigate sums and 
products of finite sequences of numbers. 


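4.119 Definition. A finite sequence of rational numbers is a function $S : N \to \mathbb{Q}$
defined on some $N \in \mathbb{N}$. The value $S(K)$ is also denoted by $S_K$; then the function $S$
is also denoted by $(S_K)$.


4.120 Example. The function $S : 9 \to \mathbb{Q}$ defined by $S_K := (2/3)^K$ is a finite
sequence of numbers:
\[
\begin{aligned}
S_0 &= (2/3)^0 = 1, \\
S_1 &= (2/3)^1 = 2/3, \\
S_2 &= (2/3)^2 = 4/9, \\
S_3 &= (2/3)^3 = 8/27, \\
S_4 &= (2/3)^4 = 16/81, \\
S_5 &= (2/3)^5 = 32/243, \\
S_6 &= (2/3)^6 = 64/729, \\
S_7 &= (2/3)^7 = 128/2187, \\
S_8 &= (2/3)^8 = 256/6561.
\end{aligned}
\]

The next definition gives a notation for the product of a finite sequence of
numbers.


4.121 Definition (Product notation). For each finite sequence of numbers
$S : N \to \mathbb{Q}$, define the "empty product" to be 1:
\[ \prod_{K<0} S_K := 1. \]
Then define the product of the first value to be the first value:
\[ \prod_{K=0}^{0} S_K := S_0. \]
Hence for each $L \in \mathbb{N}$ such that $0 < L < N$, define the product of the values $S_0, \ldots, S_L$
of the sequence $S$ "inductively" [77, p. 5] or "recursively" [49, p. 133] by
\[ \prod_{K=0}^{L} S_K := \left( \prod_{K=0}^{L-1} S_K \right) * S_L. \]

4.122 Example. Consider the finite sequence $S : 9 \to \mathbb{Q}$ defined by $S_K := (2/3)^K$:
\[
\begin{aligned}
\prod_{K<0} S_K &= 1, \\
\prod_{K=0}^{0} S_K &= S_0 = 1, \\
\prod_{K=0}^{1} S_K &= \left( \prod_{K=0}^{0} S_K \right) * S_1 = (1) * \tfrac{2}{3}, \\
\prod_{K=0}^{2} S_K &= \left( \prod_{K=0}^{1} S_K \right) * S_2 = \left( 1 * \tfrac{2}{3} \right) * \tfrac{4}{9}, \\
\prod_{K=0}^{3} S_K &= \left( \prod_{K=0}^{2} S_K \right) * S_3 = \left( 1 * \tfrac{2}{3} * \tfrac{4}{9} \right) * \tfrac{8}{27}, \\
\prod_{K=0}^{4} S_K &= \left( \prod_{K=0}^{3} S_K \right) * S_4 = \left( 1 * \tfrac{2}{3} * \tfrac{4}{9} * \tfrac{8}{27} \right) * \tfrac{16}{81}, \\
&\;\;\vdots \\
\prod_{K=0}^{8} S_K &= \left( \prod_{K=0}^{7} S_K \right) * S_8
= \left( 1 * \tfrac{2}{3} * \tfrac{4}{9} * \tfrac{8}{27} * \tfrac{16}{81} * \tfrac{32}{243} * \tfrac{64}{729} * \tfrac{128}{2187} \right) * \tfrac{256}{6561}.
\end{aligned}
\]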

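The next definition gives a notation for the sum of a finite sequence of numbers.


4.123 Definition (Sum notation). For each finite sequence of numbers $S : N \to \mathbb{Q}$,
define the "empty sum" to be 0:
\[ \sum_{K<0} S_K := 0. \]
Then define the sum of the first value to be the first value:
\[ \sum_{K=0}^{0} S_K := S_0. \]
Hence for each $L \in \mathbb{N}$ such that $0 < L < N$, define the sum of the values $S_0, \ldots, S_L$ of
the sequence $S$ inductively by
\[ \sum_{K=0}^{L} S_K := \left( \sum_{K=0}^{L-1} S_K \right) + S_L. \]

4.124 Example. Consider the finite sequence $S : 9 \to \mathbb{Q}$ defined by $S_K := (2/3)^K$:
\[
\begin{aligned}
\sum_{K<0} S_K &= 0, \\
\sum_{K=0}^{0} S_K &= S_0 = 1, \\
\sum_{K=0}^{1} S_K &= \left( \sum_{K=0}^{0} S_K \right) + S_1 = (1) + \tfrac{2}{3}, \\
\sum_{K=0}^{2} S_K &= \left( \sum_{K=0}^{1} S_K \right) + S_2 = \left( 1 + \tfrac{2}{3} \right) + \tfrac{4}{9}, \\
\sum_{K=0}^{3} S_K &= \left( \sum_{K=0}^{2} S_K \right) + S_3 = \left( 1 + \tfrac{2}{3} + \tfrac{4}{9} \right) + \tfrac{8}{27}, \\
\sum_{K=0}^{4} S_K &= \left( \sum_{K=0}^{3} S_K \right) + S_4 = \left( 1 + \tfrac{2}{3} + \tfrac{4}{9} + \tfrac{8}{27} \right) + \tfrac{16}{81}, \\
&\;\;\vdots \\
\sum_{K=0}^{8} S_K &= \left( 1 + \tfrac{2}{3} + \tfrac{4}{9} + \tfrac{8}{27} + \tfrac{16}{81} + \tfrac{32}{243} + \tfrac{64}{729} + \tfrac{128}{2187} \right) + \tfrac{256}{6561}.
\end{aligned}
\]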

The pattern in the foregoing example, called a geometric series, is amenable to 
an alternative formula, which expresses the entire sum as one ratio. 


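4.125 Theorem (geometric series). For every $N \in \mathbb{N}^*$ and every $X \in \mathbb{Q} \setminus \{1\}$,
\[ \sum_{K=0}^{N-1} X^K = \frac{1 - X^N}{1 - X}. \]
Proof. This proof uses induction with $N$. For $N := 1$, and for every $X \in \mathbb{Q} \setminus \{1\}$,
\[ \sum_{K=0}^{1-1} X^K = \sum_{K=0}^{0} X^K = X^0 = 1 = \frac{1 - X^1}{1 - X}. \]
Assume that there exists $I \in \mathbb{N}^*$ such that the theorem holds for $N := I$ and for
every $X \in \mathbb{Q} \setminus \{1\}$, so that
\[ \sum_{K=0}^{I-1} X^K = \frac{1 - X^I}{1 - X}. \]
Then
\[
\begin{aligned}
\sum_{K=0}^{(I+1)-1} X^K
&= \left( \sum_{K=0}^{I-1} X^K \right) + X^I \\
&= \frac{1 - X^I}{1 - X} + X^I \\
&= \frac{1 - X^I}{1 - X} + \frac{(1 - X) * X^I}{1 - X} \\
&= \frac{(1 - X^I) + (X^I - X^{I+1})}{1 - X} \\
&= \frac{1 - X^{I+1}}{1 - X}.
\end{aligned}
\]
□


An alternative proof of the same formula proceeds along the following outline:
\[
\begin{aligned}
\sum_{K=0}^{N-1} X^K &= 1 + X + \cdots + X^{N-1}, \\
X * \sum_{K=0}^{N-1} X^K &= X + \cdots + X^{N-1} + X^N, \\
(1 - X) * \sum_{K=0}^{N-1} X^K &= 1 + 0 + \cdots + 0 - X^N,
\end{aligned}
\]
whence dividing both sides by $(1 - X)$ yields
\[ \sum_{K=0}^{N-1} X^K = \frac{1 - X^N}{1 - X}. \]
Yet such a proof also requires induction to rearrange the terms in the subtraction.

4.126 Example. Consider the finite sequence $S : 9 \to \mathbb{Q}$ defined by $S_K := (2/3)^K$:
\[
\begin{aligned}
\sum_{K=0}^{8} S_K
&= 1 + \tfrac{2}{3} + \tfrac{4}{9} + \tfrac{8}{27} + \tfrac{16}{81} + \tfrac{32}{243} + \tfrac{64}{729} + \tfrac{128}{2187} + \tfrac{256}{6561} \\
&= \frac{1 - (2/3)^9}{1 - (2/3)} \\
&= \frac{19171/19683}{1/3} \\
&= \frac{3}{1} * \frac{19171}{19683} \\
&= \frac{19171}{6561}.
\end{aligned}
\]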
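The geometric series formula lends itself to a direct numerical comparison between the term-by-term sum and the closed form. The following Python sketch is a hedged illustration: the names `geometric_sum` and `geometric_closed_form` are hypothetical, and the standard-library type `fractions.Fraction` is assumed to model rational arithmetic exactly.

```python
from fractions import Fraction

def geometric_sum(x, n):
    """Add the n terms X^0, X^1, ..., X^(n-1) one at a time."""
    total = Fraction(0)
    term = Fraction(1)  # X^0
    for _ in range(n):
        total += term
        term *= x
    return total

def geometric_closed_form(x, n):
    """Closed form (1 - X^n) / (1 - X), valid only for X != 1."""
    if x == 1:
        raise ValueError("the closed form requires X != 1")
    return (1 - x ** n) / (1 - x)

x = Fraction(2, 3)
print(geometric_sum(x, 9), geometric_closed_form(x, 9))
```

Agreement on sample values does not prove the theorem, but a disagreement would immediately expose an index error such as summing up to $N$ instead of $N - 1$.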


4.6.4 Order on the Rational Numbers 


The determination of whether two rational numbers P/Q and M/N coincide can 
utilize equivalent fractions with a common denominator, for instance, (P * N)/(Q * 
N) and (Q * M)/(Q * N), and then with the comparison of the numerators P * N and 
Q * M. The same comparison leads to a concept of order on the rational numbers. 


4.127 Definition. Define a relation $<$ on $\mathbb{Q}$ as follows. First, $0 < (P/Q)$ if and only
if $0 < P * Q$, so that either $P$ and $Q$ are both positive, or $P$ and $Q$ are both negative:
\[ [0 < (P/Q)] \iff [0 < (P * Q)]. \]
Second, $(I/J) < (P/Q)$ if and only if $0 < [(P/Q) - (I/J)]$:
\[ [(I/J) < (P/Q)] \iff \{0 < [(P/Q) - (I/J)]\}. \]

The following theorem shows that the square of any nonzero rational number,
and the sum and product of positive rational numbers, are positive rational numbers.


4.128 Theorem. If $(M/N) > 0$ and $(I/J) > 0$, then $(M/N) + (I/J) > 0$ and
$(M/N) * (I/J) > 0$. Moreover, $(K/L) * (K/L) > 0$ for every $(K/L) \neq (0/1)$.


Proof. For the square, let $P/Q := (K/L)^2 = (K^2)/(L^2)$. Then $P * Q = (K^2) * (L^2) = (K * L)^2 > 0$ (theorem 4.88). Thus $(K/L)^2 = P/Q > 0$ (definition 4.127).

For the product, let $P/Q := (M/N) * (I/J) = (M * I)/(N * J)$. By the hypotheses,
$M * N > 0$ and $I * J > 0$. Hence $P * Q = (M * I) * (N * J) = (M * N) * (I * J) > 0$,
whence $(M/N) * (I/J) = P/Q > 0$ (definition 4.127). For the sum, let
\[ \frac{P}{Q} := \frac{M}{N} + \frac{I}{J} = \frac{(M * J) + (N * I)}{N * J}. \]
Then
\[ P * Q = [(M * J) + (N * I)] * (N * J) = (M * N) * (J * J) + (N * N) * (I * J) > 0. \]
Indeed, $J^2 > 0$ and $N^2 > 0$ (theorem 4.88), so that $(M * N) * (J^2) > 0$ and $(N^2) * (I * J) > 0$
by hypothesis, whence $P * Q = (M * N) * (J^2) + (N^2) * (I * J) > 0$. □
The next theorem shows that < is a strict total order (definition 3.187) on Q. 

4.129 Theorem. The relation < is a strict total order on the rational numbers. 


Proof. This proof verifies that $<$ is connected, irreflexive, and transitive.


Irreflexivity

$(P/Q) \nless (P/Q)$ because $0 \nless 0 = (P/Q) - (P/Q)$.


Connectedness

If $(M/N) \neq (P/Q)$, then $P * N \neq Q * M$, whence $(P * N - Q * M) \neq 0$, whence
$(Q * N) * (P * N - Q * M) \neq 0$, and then either $(Q * N) * (P * N - Q * M) < 0$, in which
case $(P/Q) - (M/N) < (0/1)$, so that $(P/Q) < (M/N)$, or $(Q * N) * (P * N - Q * M) > 0$,
in which case $(P/Q) - (M/N) > (0/1)$, so that $(P/Q) > (M/N)$.


Transitivity

If $(I/J) < (K/L)$ and $(K/L) < (M/N)$, then $(K/L) - (I/J) > 0$ and $(M/N) - (K/L) > 0$,
whence $(M/N) - (I/J) = [(M/N) - (K/L)] + [(K/L) - (I/J)] > 0$
(theorem 4.128) and then $(M/N) > (I/J)$ (definition 4.127). □
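The order of definition 4.127 can be computed on fractions without any division, even when denominators are negative. The following Python sketch is an illustration with hypothetical names (`subtract`, `is_positive`, `less`); it assumes that the difference of fractions is obtained by adding the additive inverse, so that $(P,Q) - (I,J) = (P * J - Q * I, Q * J)$.

```python
def subtract(a, b):
    """Assumed difference of fractions: (P, Q) - (I, J) := (P*J - Q*I, Q*J)."""
    (p, q), (i, j) = a, b
    return (p * j - q * i, q * j)

def is_positive(a):
    """Definition 4.127: 0 < P/Q when 0 < P * Q."""
    p, q = a
    return p * q > 0

def less(a, b):
    """Definition 4.127: I/J < P/Q when 0 < (P/Q) - (I/J)."""
    return is_positive(subtract(b, a))

print(less((1, -2), (1, 3)))  # compares -1/2 with 1/3
print(less((2, 4), (1, 2)))   # compares two fractions of the same rational
```

Testing the sign of the product $P * Q$ rather than of $P$ alone is what makes the comparison insensitive to the sign of the denominators.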


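4.130 Definition. The sets of nonzero rationals ($\mathbb{Q}^*$), negative rationals ($\mathbb{Q}^*_-$),
positive rationals ($\mathbb{Q}^*_+$), non-positive rationals ($\mathbb{Q}_-$), and nonnegative rationals
($\mathbb{Q}_+$) are
\[
\begin{aligned}
\mathbb{Q}^* &:= \mathbb{Q} \setminus \{0/1\}, \\
\mathbb{Q}_- &:= \{P/Q \in \mathbb{Q} : P/Q \leq 0/1\}, \\
\mathbb{Q}_+ &:= \{P/Q \in \mathbb{Q} : P/Q \geq 0/1\}, \\
\mathbb{Q}^*_- &:= \{P/Q \in \mathbb{Q} : P/Q < 0/1\}, \\
\mathbb{Q}^*_+ &:= \{P/Q \in \mathbb{Q} : P/Q > 0/1\}.
\end{aligned}
\]

4.131 Definition. Define the absolute value $|P/Q|$ of a rational number $P/Q$ by
\[
\left| \frac{P}{Q} \right| :=
\begin{cases}
P/Q & \text{if } P/Q \geq 0, \\
-(P/Q) & \text{if } P/Q < 0.
\end{cases}
\]

4.132 Theorem (Triangle Inequality). For all $K/L \in \mathbb{Q}$ and $M/N \in \mathbb{Q}$,
\[ \left| \frac{K}{L} + \frac{M}{N} \right| \leq \left| \frac{K}{L} \right| + \left| \frac{M}{N} \right| \]
with equality if and only if $(K/L)(M/N) \geq 0$. Also,
\[ \left| \, \left| \frac{K}{L} \right| - \left| \frac{M}{N} \right| \, \right| \leq \left| \frac{K}{L} - \frac{M}{N} \right| \]
with equality if and only if $(K/L)(M/N) \geq 0$.


Proof. Apply the definition of the absolute value to four cases. □


4.133 Theorem (Archimedean Property of the Rationals). For each rational
$P/Q \in \mathbb{Q}$ there exists a natural number $N \in \mathbb{N}$ such that $P/Q < N/1$.


Proof. If $P/Q \leq 0$, let $N := 1$. If $P/Q > 0$, let $N := |P| + 1$; then
\[ \frac{N}{1} - \frac{P}{Q} = \frac{N}{1} - \frac{|P|}{|Q|} = \frac{[(|P| + 1) * |Q|] - (1 * |P|)}{1 * |Q|} > 0 \]
because $[(|P| + 1) * |Q|] - (1 * |P|) = |P| * (|Q| - 1) + |Q| \geq |Q| > 0$. □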
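The proof of the Archimedean property is constructive: it names a specific natural number for each fraction. The following Python sketch illustrates that choice; the function name `archimedean_bound` is hypothetical, and the standard-library type `fractions.Fraction` is assumed to provide the comparison of rational numbers.

```python
from fractions import Fraction

def archimedean_bound(p, q):
    """Return the natural number N chosen in the proof of theorem 4.133:
    N := 1 if P/Q <= 0, and N := |P| + 1 if P/Q > 0."""
    if q == 0:
        raise ValueError("the denominator must be nonzero")
    if Fraction(p, q) <= 0:
        return 1
    return abs(p) + 1

for p, q in [(7, 2), (-7, 2), (5, -3), (-5, -3)]:
    n = archimedean_bound(p, q)
    print(p, q, n, Fraction(p, q) < n)
```

The bound is generally not the smallest possible one; the proof only needs some natural number that exceeds the given rational.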


4.6.5 Exercises on Rational Numbers 


4.91. Calculate (2/3) + (7/5).
4.92. Calculate (5/2) + (1/7).
4.93. Calculate (7/3) − (2/5).
4.94. Calculate (1/2) − (1/3).
4.95. Calculate (2/3) * (7/5).
4.96. Calculate (5/2) * (1/7).
4.97. Calculate (2/3) ÷ (7/5).
4.98. Calculate (5/2) ÷ (1/7).
4.99. Prove that on Q the division does not commute.
4.100. Prove that on Q division is not associative.
4.101. Prove that on Q division does not distribute over addition.
4.102. Prove that on Q division does not distribute over multiplication.
4.103. Prove that on Q addition does not distribute over division.
4.104. Prove that on Q multiplication does not distribute over division.
4.105. Prove that if 0 < (I/J) and 0 < (P/Q), then 0 < [(I/J) + (P/Q)].
4.106. Prove that if 0 < (I/J) and 0 < (P/Q), then 0 < [(I/J) * (P/Q)].
4.107. Prove that if Q > 0 and R > 0, then P/(Q/R) = (P * R)/Q.
4.108. Prove that for each P/Q ∈ Q there exists a smallest N ∈ N* such that there
exists M ∈ Z with P/Q = M/N.
4.109. Prove that for each K/L ∈ Q_+ there exists a smallest N ∈ N with K/L <
N/1.
4.110. Find rational numbers K/L and M/N with (K/L)^2 + (M/N)^2 = 1.


4.7 Finite Cardinality 


4.7.1 Equal Cardinalities 


The adjective “cardinal” means “principal” or “of greatest importance” . In the 
context of sets, the cardinal feature of sets is their size. One way to define the “size” 
of a set consists in establishing a correspondence with another set of known size, 
for instance, a natural number, as in figure 4.1. Such a natural number is then the 
“number” of elements in the set, and the correspondence amounts to an operation 
of counting. Thus the natural numbers constitute the “standard” sizes with which to 
“count” sets. More generally, two sets have the same cardinality if and only if there 
exists a bijection between those two sets. 


4.134 Definition. For all sets $A$ and $B$, the sets $A$ and $B$ have the same cardinality
if and only if there exists a bijection $F : A \to B$, a situation denoted by
\[
A \approx B, \qquad
\overline{\overline{A}} = \overline{\overline{B}}, \qquad
|A| = |B|, \qquad
\#(A) = \#(B), \qquad
\operatorname{card}(A) = \operatorname{card}(B).
\]

Definition 4.134 does not yet define the concept of cardinality; it only defines
the concept of same cardinality. Because such a definition leaves the notation $\#(A)$
yet undefined, this exposition adopts the notation $A \approx B$, which merely means that
there exists a bijection from $A$ to $B$.

Fig. 4.1 Same cardinality: Earth and Moon, or Pluto and Charon, or $\{\varnothing, \{\varnothing\}\}$. (NASA Photo IDs: STScI-PR94-17 and P-41508.)


4.135 Example. All empty sets have the same cardinality. Indeed, by the axiom of
extensionality there exists only one empty set, namely $\varnothing$, and the empty function
$\varnothing : \varnothing \to \varnothing$ is a bijection, whence $\varnothing \approx \varnothing$.


4.136 Example. All singletons have the same cardinality. Indeed, for all sets $X$ and
$Y$ and all singletons $\{X\}$ and $\{Y\}$, the function $F : \{X\} \to \{Y\}$ defined by $F := \{(X, Y)\}$
is a bijection. Thus $\{X\} \approx \{Y\}$.

4.137 Example. The sets $A := \{4, 9\}$ and $B := \{2, 3\}$ have the same cardinality,
thanks to the bijection $F : A \to B$ with $F := \{(4, 2), (9, 3)\}$. The other bijection,
$G := \{(4, 3), (9, 2)\}$, could also serve to prove that $A$ and $B$ have the same
cardinality.


The following theorem forms the basis for the relation between the addition of 
natural numbers and the union of disjoint sets. 


4.138 Theorem. For all sets $A$, $B$, $C$, and $D$, if
\[ A \approx C, \qquad B \approx D, \qquad A \cap B = \varnothing, \qquad C \cap D = \varnothing, \]
then
\[ (A \cup B) \approx (C \cup D). \]
Proof. The hypotheses $A \approx C$ and $B \approx D$ mean that there exist bijections
\[ F : A \to C, \qquad G : B \to D. \]
Such bijections lead to a bijection
\[ H : (A \cup B) \to (C \cup D) \]
defined by
\[ H := (F \cup G) \subseteq (A \cup B) \times (C \cup D), \]
so that
\[
[(X, Y) \in H] \iff
\begin{cases}
Y = F(X) & \text{if } X \in A, \\
Y = G(X) & \text{if } X \in B.
\end{cases}
\]
The relation $H$ just defined is a function, because for each $X \in (A \cup B)$ the relation
$H$ contains only one pair $(X, Y)$. Indeed, thanks to $A \cap B = \varnothing$, either $X \in A$ and
then $Y = F(X)$, or $X \in B$ and then $Y = G(X)$, but not both (definition 3.94).

To verify the injectivity of $H$, assume that $W \in (A \cup B)$ and $X \in (A \cup B)$ have the
same image $H(W) = Y = H(X)$ in $C \cup D$. Because of $C \cap D = \varnothing$, either both
images lie in $C$ or both lie in $D$. In the first case, if both images lie in $C$, then $W \in A$
and $X \in A$, and then
\[ F(W) = H(W) = H(X) = F(X), \]
whence $W = X$ by injectivity of $F$. In the second case, if both images lie in $D$, then
$W \in B$ and $X \in B$, and then
\[ G(W) = H(W) = H(X) = G(X), \]
whence $W = X$ by injectivity of $G$. To verify the surjectivity of $H$, assume that
$Z \in (C \cup D)$. Then either $Z \in C$ or $Z \in D$. In the first case, if $Z \in C$, then the
surjectivity of $F$ guarantees the existence of an element $W \in A$ such that
\[ Z = F(W) = H(W). \]
In the second case, if $Z \in D$, then the surjectivity of $G$ guarantees the existence of
an element $X \in B$ such that
\[ Z = G(X) = H(X). \]
Therefore, $H : (A \cup B) \to (C \cup D)$ is a bijection. □
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The following theorem forms the basis for the relation between multiplication of 
natural numbers and Cartesian products of sets. 


4.139 Theorem. For all sets $A$, $B$, $C$, and $D$, if
\[ A \approx C, \qquad B \approx D, \]
then
\[ (A \times B) \approx (C \times D). \]
Proof. The hypotheses $A \approx C$ and $B \approx D$ mean that there exist bijections
\[ F : A \to C, \qquad G : B \to D. \]
Such bijections lead to a bijection $H$ defined by
\[ H : (A \times B) \to (C \times D), \qquad (W, X) \mapsto (F(W), G(X)). \]
The relation $H$ is a function, because $F$ and $G$ are functions, so that for each $W \in A$
and each $X \in B$ there exists at most one $Y \in C$ and at most one $Z \in D$ with
$(W, Y) \in F$ and $(X, Z) \in G$, so that there exists at most one $(Y, Z) \in C \times D$ with
$((W, X), (Y, Z)) \in H$. To verify the injectivity of $H$, assume that $(W, X) \in (A \times B)$
and $(U, V) \in (A \times B)$ have the same image $H(W, X) = H(U, V)$ in $C \times D$:
\[
\begin{aligned}
& H(W, X) = H(U, V) && \text{hypothesis,} \\
& \Updownarrow && \text{definition of } H, \\
& (F(W), G(X)) = (F(U), G(V)) \\
& \Updownarrow && \text{equality of pairs,} \\
& [F(W) = F(U)] \wedge [G(X) = G(V)] \\
& \Updownarrow && \text{injectivity of } F \text{ and } G, \\
& [W = U] \wedge [X = V] \\
& \Updownarrow && \text{equality of pairs,} \\
& (W, X) = (U, V).
\end{aligned}
\]
To verify the surjectivity of $H$, assume that $(R, S) \in (C \times D)$. Then the surjectivity
of $F$ guarantees the existence of an element $W \in A$ such that
\[ R = F(W), \]
and the surjectivity of $G$ guarantees the existence of an element $X \in B$ such that
\[ S = G(X). \]
Consequently,
\[ (R, S) = (F(W), G(X)) = H(W, X). \]
Therefore, $H : (A \times B) \to (C \times D)$ is a bijection. □
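For small finite sets, the two constructions of a bijection $H$ from bijections $F$ and $G$, for disjoint unions and for Cartesian products, can be carried out explicitly. The following Python sketch is an illustration in which dictionaries stand in for functions; the names `is_bijection`, `union_bijection`, and `product_bijection`, as well as the second pair of sets, are hypothetical choices.

```python
def is_bijection(f, domain, codomain):
    """Check that the dict f maps domain one-to-one onto codomain."""
    images = list(f.values())
    return (set(f) == set(domain)
            and set(images) == set(codomain)
            and len(set(images)) == len(images))

def union_bijection(f, g):
    """H := F union G, assuming disjoint domains and disjoint codomains."""
    h = dict(f)
    h.update(g)
    return h

def product_bijection(f, g):
    """H(W, X) := (F(W), G(X)) on the Cartesian product of the domains."""
    return {(w, x): (f[w], g[x]) for w in f for x in g}

F = {4: 2, 9: 3}            # A = {4, 9} -> C = {2, 3}, as in example 4.137
G = {"a": "c", "b": "d"}    # hypothetical B = {"a", "b"} -> D = {"c", "d"}

H_union = union_bijection(F, G)
H_product = product_bijection(F, G)

print(is_bijection(H_union, {4, 9, "a", "b"}, {2, 3, "c", "d"}))
print(len(H_product))
```

Without the disjointness hypotheses the dictionary update in `union_bijection` could silently overwrite a value, which corresponds to the relation $F \cup G$ failing to be a function or failing to be injective.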


4.7.2 Finite Sets 


The following definition establishes the concept of cardinality for finite sets. 


4.140 Definition. For each set $S$, the set $S$ is finite if and only if there exists $N \in \mathbb{N}$
and a bijection $F : N \to S$. Such a natural number $N$ is then called the number of
elements in $S$, or the cardinality of $S$, which is denoted by $\#(S)$, $|S|$, or $\overline{\overline{S}}$.


4.141 Example. For each natural number $N \in \mathbb{N}$ the set $N$ is finite and has $N$
elements, because the identity function $I_N : N \to N$, $K \mapsto K$ is a bijection.


4.142 Remark. Because every bijection has an inverse function, a set $S$ is finite if
and only if there exist a natural number $N \in \mathbb{N}$ and a bijection $G : S \to N$, for
instance, the inverse function $G := F^{\circ -1}$ for any bijection $F : N \to S$.


The following theorem shows that the insertion of a new element into a set
corresponds to the arithmetic addition of 1 to its cardinality.


4.143 Theorem. The equality $\#(A \cup \{Z\}) = [\#(A)] + 1$ holds for each finite set $A$,
and for each set $Z \notin A$.


Proof. For each finite set $A$ there is a natural number $N$ and a bijection $F : N \to A$.
Moreover, for each set $Z$ there exists a bijection of singletons $G : \{N\} \to \{Z\}$.
Consequently, because $A \cap \{Z\} = \varnothing$ by hypothesis, and because $N \cap \{N\} = \varnothing$
by theorem 4.42, it follows that theorem 4.138 gives a bijection $N + 1 \to A \cup \{Z\}$:
\[ H = (F \cup G) : N + 1 = (N \cup \{N\}) \to (A \cup \{Z\}). \]
□


The following theorem shows that the cardinality of the union of two disjoint 
finite sets equals the arithmetic sum of their two cardinalities. 


4.144 Theorem. The equality #(A ∪ B) = [#(A)] + [#(B)] holds for all disjoint finite
sets A and B.


Proof. This proof proceeds by induction with the cardinality of the second set.
If #(B) = 0, then B = ∅ by definition, whence for each finite set A,


#(A ∪ B) = #(A ∪ ∅) = #(A) = [#(A)] + 0 = [#(A)] + [#(∅)].


Hence, assume that there exists a natural number N ∈ ℕ for which the theorem
holds, so that the equality #(A ∪ B) = [#(A)] + [#(B)] holds for all disjoint finite sets
A and B with #(B) = N. For each set C with N + 1 elements and disjoint from A,
there exists a bijection F : N + 1 → C. Consequently, the subset
B := F"(N) = F"({0, ..., N − 1}) has N elements, because the restriction
F|_N : N → B is a bijection. Hence, with the element Z := F(N), it follows that
C = B ∪ {Z} with Z ∉ B, whence


#(A ∪ C)
   = #(A ∪ [B ∪ {Z}])           because C = B ∪ {Z},
   = #[(A ∪ B) ∪ {Z}]           associativity of ∪,
   = #(A ∪ B) + #({Z})          theorem 4.143,
   = [#(A) + #(B)] + #({Z})     induction hypothesis,
   = #(A) + [#(B) + #({Z})]     associativity of +,
   = #(A) + #(C)                theorem 4.143.


□
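
The gluing of bijections used in theorems 4.143 and 4.144 can be sketched in Python. This is only an illustration under stated assumptions: finite sets are modeled as Python collections without repeated elements, a natural number N is modeled by the indices 0, ..., N − 1, and the helper names `glue` and `enumerate_disjoint_union` are hypothetical, not notation from the text.

```python
def glue(f, g):
    """Union of two bijections (as dicts) with disjoint domains and disjoint ranges."""
    if set(f) & set(g) or set(f.values()) & set(g.values()):
        raise ValueError("domains and ranges must be disjoint")
    return {**f, **g}


def enumerate_disjoint_union(a, b):
    """Return a bijection {0, ..., #(A)+#(B)-1} -> A ∪ B for disjoint A and B."""
    a, b = list(a), list(b)
    if set(a) & set(b):
        raise ValueError("the two sets must be disjoint")
    f = dict(enumerate(a))                          # bijection #(A) -> A
    g = {len(a) + i: y for i, y in enumerate(b)}    # shifted bijection onto B
    return glue(f, g)
```

The domain of the glued map has #(A) + #(B) indices, which is the count asserted by theorem 4.144.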
The following two theorems confirm that every subset of a finite set is also finite. 


4.145 Theorem. For each N ∈ ℕ, every subset S ⊆ N is also a finite set, with at
most N elements. Moreover, each proper subset S ⊊ N has fewer than N elements.


Proof. This proof proceeds by induction with N.

If N := 0, then N = ∅, and the only subset S ⊆ N is S = ∅, which is finite.

As an induction hypothesis, assume that there exists a natural number K ∈ ℕ
such that the theorem holds for N := K, so that each subset S ⊆ K is finite with
at most K elements, and that each proper subset S ⊊ K has fewer than K elements.
Hence, consider a subset R ⊆ K + 1. Two cases arise: either K ∉ R or K ∈ R.

If K ∉ R, then R ⊆ [(K + 1) \ {K}] = K, whence R is finite with at most K
elements by induction hypothesis.

If K ∈ R, then the set C := R \ {K} is a subset of [(K + 1) \ {K}] = K, whence
C is finite with at most K elements by induction hypothesis. Thus, there exists a
natural number L ≤ K such that C = R \ {K} has cardinality L ≤ K, and L = K
if and only if C = K. Hence, theorem 4.143 shows that R = C ∪ {K} is finite with
cardinality L + 1 ≤ K + 1, and L + 1 = K + 1 if and only if R = K + 1. □


4.146 Theorem. Every subset of a finite set is also finite, with at most as many 
elements. 


Proof. For each finite set A there exist a natural number N ∈ ℕ and a bijection
F : A → N. Hence, for each subset B ⊆ A, the restriction F|_B : B → N is a
bijection from B onto a subset S := F"(B) ⊆ N. Because every such subset S ⊆ N
is also finite with at most N elements, it follows that there exist a natural number
L ≤ N and a bijection G : S → L. Consequently, the composition G ∘ F|_B establishes
a bijection from B onto L, which means that B is finite with L ≤ N elements. □


4.147 Theorem. The equality #(A \ B) = [#(A)] − [#(B)] holds for every subset
B ⊆ A of every finite set A.


Proof. The result follows from the disjointness of B and A \ B, and from
theorem 4.146, which ensures that both B and A \ B are finite:


B ∩ (A \ B) = ∅,
B ∪ (A \ B) = A,
[#(B)] + [#(A \ B)] = #(A),


whence #(A \ B) = [#(A)] − [#(B)] by definition (4.59) of subtraction. □


Theorem 4.145 shows that for each subset S ⊆ N there is a bijection F : S → L
with L ≤ N, but it does not yet prevent the existence of other bijections G : S → L
with L > N. The following theorem confirms that there is no such bijection.


4.148 Theorem. For all K, N ∈ ℕ with K < N, there are no injections F : N → K.


Proof. This proof uses induction with N. For N := 1, the only smaller natural
number is K := 0, and there exists no function F : 1 → ∅ (because 1 = {0} ≠ ∅),
whence no injection either.

As an induction hypothesis, assume that there exists a positive integer L ∈ ℕ*
such that the theorem holds for N := L, so that for every natural number K < L
there are no injections F : L → K. The proof that for every K < L + 1 there
are no injections F : (L + 1) → K proceeds by contraposition. Thus, assume that
there is an injection F : (L + 1) → K. Let Z := F(L) and S := K \ {Z}. By
theorem 4.147, S is a finite set with K − 1 elements, and there is a bijection
G : S → (K − 1). Then the restriction F|_L : L → S is an injection, and the
composition G ∘ F|_L : L → (K − 1) is an injection. Hence K − 1 ≥ L by the
induction hypothesis, whence K ≥ L + 1. □
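
For small values, theorem 4.148 can be checked by exhaustive search. The following Python sketch is an illustration, not a proof: it models the natural number N as `range(n)`, represents a function N → K by the tuple of its values, and uses the hypothetical helper names `injections` and `count_injections`.

```python
from itertools import product


def injections(n, k):
    """Yield every injective function range(n) -> range(k), as a tuple of values."""
    for values in product(range(k), repeat=n):
        if len(set(values)) == n:      # no value is repeated
            yield values


def count_injections(n, k):
    """Number of injections from an n-element set to a k-element set."""
    return sum(1 for _ in injections(n, k))
```

Calling `count_injections(n, k)` with k < n returns 0 for every pair small enough to enumerate, in agreement with the theorem.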


The following theorem shows the equivalence of the concepts of injection,
surjection, and bijection between sets with the same finite cardinality.


4.149 Theorem. For all sets A and B with the same finite cardinality, and for each
function F : A → B with domain 𝒟(F) = A, the following conditions are mutually
equivalent:


(P)  F is injective,
(Q)  F is surjective,
(R)  F is bijective.


Proof. If F is bijective, then F is also injective and surjective, because
(R) ⇔ [(P) ∧ (Q)] by definition of bijectivity (definition 3.118); therefore both
(R) ⇒ (P) and (R) ⇒ (Q) hold, by theorems 1.53 and 1.54.

For the reverse implications, because A and B have the same finite cardinality,
there exist a natural number N ∈ ℕ and a bijection G : N → A.

If F is injective, then F"(A) ⊆ B is a finite set with cardinality L ≤ N, so that
there exists a bijection H : F"(A) → L. Then the composition


H ∘ F ∘ G : N → A → F"(A) → L


is an injection, whence L ≥ N by theorem 4.148. From L ≤ N and N ≤ L it follows
that L = N. Hence H, G, and (H ∘ F ∘ G) : N → N are bijections, whence F is also
surjective, whence bijective, which proves the implication (P) ⇒ (Q); therefore
(P) ⇒ [(P) ∧ (Q)] holds (by theorem 1.82), whence also (P) ⇒ (R).

The proof of the converse, (Q) ⇒ (P), uses the contraposition [¬(P)] ⇒ [¬(Q)].
If F is not injective, then A contains two distinct elements X ≠ Z such that
F(X) = F(Z). Let S := A \ {Z}, so that F"(A) = F"(S) and there exists a bijection
J : (N − 1) → S, by theorem 4.147. Also, there is a bijection I : B → N, because
A and B have the same cardinality, N. Then the composition


I ∘ F ∘ J : (N − 1) → S → F"(S) → N


cannot be a bijection, because by theorem 4.148 its inverse could not be an injection
from N to N − 1. Hence F cannot be surjective, which proves (Q) ⇒ (P). □
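
The equivalence in theorem 4.149 can also be observed by brute force on small sets. The sketch below is illustrative only and makes the following assumptions: both sets are modeled as `range(n)`, a function is the tuple of its values, and the helper names are hypothetical.

```python
from itertools import product


def is_injective(values):
    """A function given by its tuple of values is injective if no value repeats."""
    return len(set(values)) == len(values)


def is_surjective(values, codomain_size):
    """The function is surjective if every element of range(codomain_size) is a value."""
    return set(values) == set(range(codomain_size))


def check_equivalence(n):
    """Check injective <=> surjective for every function range(n) -> range(n)."""
    return all(
        is_injective(v) == is_surjective(v, n)
        for v in product(range(n), repeat=n)
    )
```

The hypothesis of equal cardinalities matters: the function with values `(0, 1)` into a three-element set is injective but not surjective.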


The following theorem shows that the number of elements in a Cartesian product 
equals the product of the numbers of elements in its factors. 


4.150 Theorem. For all K, L ∈ ℕ, the Cartesian product K × L has K * L elements.

Proof. By induction with L, if L = 0, then L = ∅, whence for every K ∈ ℕ


K × 0 = K × ∅ = ∅,
#(K × 0) = #(∅) = 0 = K * 0.


As induction hypothesis, assume that there exists a natural number M ∈ ℕ such
that the theorem holds for L := M, so that the equality


#(K × M) = K * M


holds for every K ∈ ℕ. Hence, from the disjoint union M + 1 = M ∪ {M}, and from
the distributivity of Cartesian products over unions (theorem 3.63), it follows that


K × (M + 1) = K × (M ∪ {M})
            = (K × M) ∪ (K × {M}),


#[K × (M + 1)] = #[(K × M) ∪ (K × {M})]
               = [#(K × M)] + [#(K × {M})]
               = (K * M) + K
               = K * (M + 1),


thanks to the disjoint union (K × M) ∪ (K × {M}). □
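
As a small numerical companion to theorem 4.150, the following Python sketch counts the ordered pairs directly. It assumes that the natural numbers K and L are modeled by `range(k)` and `range(l)`; the function name `product_size` is hypothetical.

```python
from itertools import product


def product_size(k, l):
    """Count the ordered pairs in range(k) x range(l) by listing them."""
    return len(set(product(range(k), range(l))))
```

The induction step of the proof corresponds to the identity `product_size(k, m + 1) == product_size(k, m) + k`.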


4.7.3 Exercises on Finite Sets


4.111. Determine #(∅).
4.112. Determine #[𝒫(∅)].
4.113. Determine #(𝒫[𝒫(∅)]).
4.114. Determine #[𝒫(𝒫[𝒫(∅)])].
4.115. Determine #(𝒫[𝒫(𝒫[𝒫(∅)])]).
4.116. Determine #[𝒫({∅, {∅}, {∅, {∅}}})].
4.117. Prove or disprove that all ordered pairs have the same cardinality.
4.118. For all finite sets A and B, prove that A × B and B × A have the same
cardinality.
4.119. For all finite sets A, B, C, prove that (A × B) × C and A × (B × C) have
the same cardinality.
4.120. For all finite sets A and B, prove [#(A Δ B)] = [#(A ∪ B)] − [#(A ∩ B)].


4.8 Infinite Cardinality 


4.8.1 Infinite Sets


Some sets are not finite, for they do not admit any bijection onto any natural number. 


4.151 Definition (infinite sets). A set Z is infinite if and only if Z is not finite, 
which means that there are no bijections from Z onto any natural number. 


For instance, the set ℕ is infinite.


4.152 Theorem. The set ℕ of all natural numbers is not finite.


Proof. For each natural number N and for each function F : ℕ → N, the restriction
F|_{N+1} : (N + 1) → N cannot be injective, by theorem 4.148, whence F cannot be
injective. Therefore, there are no bijections from ℕ onto any natural number, which
means that ℕ is not finite. □


As there exist finite sets with different cardinalities, there also exist infinite sets 
with different cardinalities. For instance, the following considerations lead to infinite 
sets with cardinalities different from the cardinality of the natural numbers. 


4.153 Definition. For all sets X, Y, let Y^X denote the set of all functions from X to
Y, with domain X.


4.154 Example. If X := ∅, then 2^X = {∅}, because ∅ : ∅ → {0, 1} is the only
function from X = ∅ to 2 = {0, 1}.


4.155 Example. If X := {S} is a singleton, then 2^X consists of two elements,
because there are exactly two functions from X = {S} to 2 = {0, 1}:


F : {S} → {0, 1},
S ↦ 0;

G : {S} → {0, 1},
S ↦ 1.


Thus, 2^X = {F, G} has exactly two elements, F and G.

The next theorem shows that each set X is “strictly smaller” than 2^X.


4.156 Theorem. For each set X, there exists an injection from X to 2^X. Yet there
does not exist any surjection from X to 2^X.


Proof. To establish the existence of an injection X ↪ 2^X, consider the function
J : X ↪ 2^X defined as follows. The function J maps each element N ∈ X to a
function J(N) : X → 2 specified by


\[ [J(N)](K) := \begin{cases} 1 & \text{if } K = N, \\ 0 & \text{if } K \neq N. \end{cases} \]


In other words, the function J(N) is the characteristic function χ_{N} of the singleton
{N} (from example 3.89). Consequently, J is injective; indeed if M ≠ N, then
[J(M)](M) = 1 but [J(N)](M) = 0, whence J(M) ≠ J(N).

To prove the absence of any surjection X → 2^X, for each function J : X → 2^X,
this proof demonstrates a method known as Cantor’s diagonalization to show that
J is not surjective. Each such function J : X → 2^X maps each element N ∈ X to a
function J(N) ∈ 2^X, so that J(N) : X → 2. In particular, J(N) maps N to an element
[J(N)](N) ∈ 2 = {0, 1}. Thus, define a function F : X → 2 by


\[ F(N) := \begin{cases} 0 & \text{if } [J(N)](N) = 1, \\ 1 & \text{if } [J(N)](N) = 0. \end{cases} \]


Thus, F(N) ≠ [J(N)](N) for every N ∈ X, whence F ≠ [J(N)] for every N ∈ X.
Consequently F ∉ J"(X), whence J is not surjective. □
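
Cantor’s diagonalization from the proof of theorem 4.156 can be exercised on a small finite set. The Python sketch below is only an illustration under these assumptions: functions X → 2 are modeled as dictionaries with values 0 or 1, a function J : X → 2^X is a dictionary of such dictionaries, and the names `diagonal`, `all_functions`, and `no_surjection` are hypothetical.

```python
from itertools import product


def diagonal(j, xs):
    """Return F : X -> {0, 1} with F(x) != [J(x)](x) for every x in X."""
    return {x: 1 - j[x][x] for x in xs}


def all_functions(xs):
    """List every function from xs to {0, 1}, each as a dict."""
    return [dict(zip(xs, bits)) for bits in product((0, 1), repeat=len(xs))]


def no_surjection(xs):
    """Check that every J : X -> 2^X misses its own diagonal function."""
    funcs = all_functions(xs)
    for choice in product(funcs, repeat=len(xs)):
        j = dict(zip(xs, choice))          # one candidate J : X -> 2^X
        f = diagonal(j, xs)
        if any(j[x] == f for x in xs):     # would mean F is in the image of J
            return False
    return True
```

For a three-element set the search covers all 8³ = 512 candidate functions J, and none of them reaches its diagonal function.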


4.157 Example. One among several methods to define the set ℝ of all real numbers
consists in defining ℝ as a set of infinite sequences of digits [118, p. 565-566]. Thus
the set of all real numbers between 0 and 1 can be defined as the set of all sequences
R ∈ 3^ℕ (subject to the constraint that there does not exist any K ∈ ℕ such that
R(N) = 2 for every N > K). Then every function J : ℕ → ℝ fails to be surjective.
Indeed, as in the proof of theorem 4.156, for each function J : ℕ → ℝ define a
function G : ℕ → 3 by


\[ G(N) := \begin{cases} 1 & \text{if } [J(N)](N) = 2, \\ 0 & \text{if } [J(N)](N) = 1, \\ 1 & \text{if } [J(N)](N) = 0. \end{cases} \]


Thus, G ∈ ℝ but G(N) ≠ [J(N)](N) for every N ∈ ℕ, whence G ≠ [J(N)] for every
N ∈ ℕ. Consequently G ∉ J"(ℕ), and therefore J is not surjective. Hence there does
not exist any bijection J : ℕ → ℝ.


Example 4.157 reveals that the set ℝ of all real numbers is “more infinite” than
the set ℕ of all natural numbers. Moreover, applying theorem 4.156 to X := ℝ
shows that 2^ℝ is also “more infinite” than ℝ. Then 2^(2^ℝ) is also “more infinite”
than 2^ℝ. And so forth; thus there exists an “infinite” variety of “infinite” sets.


4.8.2 Denumerable Sets 


There exist several infinite sets that have the same cardinality as N has. For instance, 
using only addition and multiplication from integer arithmetic, this subsection 
presents a proof that the set of all nonnegative integers N and the set of all integers 
Z have the same cardinality; similarly, N and the Cartesian product N x N have the 
same cardinality. The following terminology conforms to [8, p. 152], [30, p. 47], and 
[128, p. 151]. 


4.158 Definition. A set is denumerable, or has cardinality ℵ₀ (read “aleph
zero”), if and only if it has the same cardinality as the set ℕ of all natural numbers.
A set is countable if and only if it is either finite or denumerable.


The following theorem shows that the set Z of all the integers has the same 
cardinality as the set N of all the nonnegative integers. 




4.159 Theorem. The sets ℕ and ℤ have the same cardinality.

Proof. Define F : ℕ → ℤ by


\[ F(N) := \begin{cases} N/2 & \text{if there exists } K \in \mathbb{N} \text{ with } N = 2 * K, \\ -(N + 1)/2 & \text{if there exists } K \in \mathbb{N} \text{ with } N + 1 = 2 * K. \end{cases} \]


The function F is surjective. Indeed, if L ∈ ℕ, then L = F(2 * L). Similarly, if
L ∈ ℤ \ ℕ, then L = F([2 * (−L)] − 1).

The function F is also injective. Indeed, if F(M) = F(N), then either both or
neither of F(M) and F(N) are elements of ℕ. If F(M) ∈ ℕ and F(N) ∈ ℕ, then
M/2 = F(M) = F(N) = N/2, whence M = N. If F(M) ∉ ℕ and F(N) ∉ ℕ, then
−(M + 1)/2 = F(M) = F(N) = −(N + 1)/2, whence M = N. □
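
The bijection of theorem 4.159 and its inverse are short enough to write out. The following Python sketch assumes Python integers for both ℕ and ℤ; the names `nat_to_int` and `int_to_nat` are hypothetical.

```python
def nat_to_int(n):
    """F : N -> Z, sending even n to n/2 and odd n to -(n + 1)/2."""
    if n % 2 == 0:
        return n // 2
    return -((n + 1) // 2)


def int_to_nat(z):
    """Inverse of F: 2z for z >= 0, and 2(-z) - 1 for z < 0."""
    if z >= 0:
        return 2 * z
    return 2 * (-z) - 1
```

The first values of `nat_to_int` are 0, −1, 1, −2, 2, −3, 3, so the enumeration alternates between the nonnegative and the negative integers.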


4.160 Theorem. For all disjoint denumerable sets A and B, A ∪ B is denumerable.


Proof. By the hypotheses there exist bijections I : A → ℕ and J : B → ℕ. The
function K : ℕ → ℤ with K(N) := −(N + 1) is injective, and hence the composition
K ∘ J is also injective. Therefore, the function G := (K ∘ J) ∪ I is a bijection from
A ∪ B to ℤ. Hence F°⁻¹ ∘ G : A ∪ B → ℕ is a bijection, with F as in theorem 4.159.
□


To facilitate the proof that ℕ and the Cartesian product ℕ × ℕ have the same
cardinality, the following definition specifies an inductive method to define and
compute the sum 1 + 2 + ··· + (N − 1) + N, known as an arithmetic series.


4.161 Definition. Define a function T : ℕ → ℕ by

T(0) := 0,
T(N + 1) := T(N) + (N + 1).

Also, define the notation 0 + 1 + ··· + N := T(N).

Thus,

T(0) := 0,
T(1) := T(0) + (0 + 1) = 0 + (0 + 1) = 1,
T(2) := T(1) + (1 + 1) = 1 + (1 + 1) = 3,
T(3) := T(2) + (2 + 1) = 3 + (2 + 1) = 6.

The values of T are called triangular numbers because they correspond to the
number of elements in the following patterns:


T(0)    T(1)    T(2)    T(3)
         •       •       •
                • •     • •
                       • • •

The following theorem provides a different formula to compute the same function 
T. 


4.162 Theorem (arithmetic series). For each natural number N ∈ ℕ,


\[ 0 + 1 + 2 + \cdots + (N - 1) + N = T(N) = \frac{N * (N + 1)}{2}. \]


Proof. This proof uses induction with N. If N := 0, then T(0) = 0 = 0 * (0 + 1)/2.
Assume that there exists M ∈ ℕ such that the theorem holds for N := M, so that
T(M) ∈ ℕ and 2 * T(M) = M * (M + 1). Then T(M + 1) = T(M) + (M + 1) ∈ ℕ,
and


2 * T(M + 1) = 2 * [T(M) + (M + 1)]
             = [2 * T(M)] + [2 * (M + 1)]
             = M * (M + 1) + 2 * (M + 1)
             = (M + 2) * (M + 1)
             = (M + 1) * [(M + 1) + 1].


Consequently, 2 * T(M + 1) = (M + 1) * [(M + 1) + 1], but T(M + 1) ∈ ℕ, whence
(M + 1) * [(M + 1) + 1]/2 ∈ ℕ and T(M + 1) = (M + 1) * [(M + 1) + 1]/2. □

An alternative proof of the same formula proceeds along the following outline:

T(N)        = 0 +    1    +    2    + ··· +    N,
T(N)        = 0 +    N    + (N − 1) + ··· +    1,
T(N) + T(N) = 0 + (N + 1) + (N + 1) + ··· + (N + 1),


with N terms equal to N + 1, whence 2 * T(N) = N * (N + 1). However, a proof
along this outline also requires
induction to rearrange the terms of the sum with associativity and commutativity.
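
The recursive definition 4.161 and the closed formula of theorem 4.162 can be compared numerically. The sketch below is a check on finitely many values, not a substitute for the induction; the function name `triangular` is hypothetical.

```python
def triangular(n):
    """Compute T(n) from the recursion T(0) = 0, T(k + 1) = T(k) + (k + 1)."""
    t = 0
    for k in range(n):
        t = t + (k + 1)
    return t
```

For every tested N, the value `2 * triangular(N)` coincides with `N * (N + 1)`, which is the equality established in the proof.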
The following definition provides a formula for Cantor’s diagonal enumeration
of ℕ × ℕ, and the subsequent theorems will verify that indeed it enumerates ℕ × ℕ.


4.163 Definition. Define a function 𝒯 : (ℕ × ℕ) → ℕ by


\[ \mathcal{T}(M, N) := M + T(M + N) = M + \frac{(M + N) * (M + N + 1)}{2}. \]


The value 𝒯(M, N) corresponds to the sum of the number of elements in the
triangular pattern counted by the “triangular number” T(M + N) and a last partial
row with M elements (instead of a complete last row of M + N + 1 elements for the
next triangular number). For example, with M := 1 and N := 2, the value is
𝒯(1, 2) = 1 + T(3) = 7:


𝒯(1, 2)
T(M + N)    •
            • •
            • • •
M           •


The following theorem shows that the function 𝒯 : (ℕ × ℕ) → ℕ is surjective.

4.164 Theorem. For each I ∈ ℕ there exist M ∈ ℕ and N ∈ ℕ such that


\[ I = M + \frac{(M + N) * (M + N + 1)}{2}. \]


Proof. This proof proceeds by induction with I.

First, if I := 0, then I = 0 + T(0 + 0) with M := 0 and N := 0.

Second, assume that there exists K ∈ ℕ such that the theorem holds for I := K,
so that there exist M, N ∈ ℕ with K = M + T(M + N).

If N > 0, then K + 1 = 1 + [M + T(M + N)] = (M + 1) + T([M + 1] + [N − 1]).

If N = 0, then K + 1 = 1 + [M + T(M + 0)] = (M + 1) + T(M + 0) =
0 + T(0 + [M + 1]). □


The following theorem shows that the function 𝒯 : (ℕ × ℕ) → ℕ is injective.


4.165 Theorem. For all K, L, M, N ∈ ℕ, if K + T(K + L) = M + T(M + N), then
both K = M and L = N.


Proof. If

K + T(K + L) = M + T(M + N),

then subtracting M and T(K + L) from both sides gives

K − M = T(M + N) − T(K + L).

If K > M, then K − M > 0, so that the left-hand side is positive, but then the
right-hand side must also be positive: T(M + N) − T(K + L) > 0. Hence
M + N > K + L, but then

T(M + N) − T(K + L) ≥ T(M + N) − T(M + N − 1) = M + N

by definition of T. Thus

K − M = T(M + N) − T(K + L) ≥ M + N

whence K ≥ (2 * M) + N. With M + N > K + L, this gives

M + N > K + L ≥ [(2 * M) + N] + L

whence subtracting (2 * M) + N from all sides gives

−M > L,

which contradicts the hypothesis that L ≥ 0. By symmetry, K < M is also
impossible. Hence K = M, whence T(K + L) = T(M + N), and then K + L = M + N
because T is strictly increasing, so that L = N. □

The following theorem confirms that ℕ × ℕ is denumerable.

4.166 Theorem. The function 𝒯 : (ℕ × ℕ) → ℕ is bijective.

Proof. The bijectivity results from theorems 4.165 and 4.164. □


The computation of the inverse function 𝒯⁻¹ : ℕ → ℕ × ℕ can proceed
according to the straightforward algorithm provided by the proof of theorem 4.164:
for each I ∈ ℕ, observe that T(0) = 0 ≤ I, and compute T(0), T(1), ..., T(L) until
T(L − 1) ≤ I < T(L). Then let M := I − T(L − 1) and N := (L − 1) − M.
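
The diagonal enumeration of definition 4.163 and the search for its inverse described above can be sketched as follows. This is a minimal illustration: the names `T`, `pair`, and `unpair` are hypothetical, and the variable `s` in `unpair` plays the role of L − 1 in the text.

```python
def T(n):
    """Triangular number T(n) = n * (n + 1) / 2."""
    return n * (n + 1) // 2


def pair(m, n):
    """Cantor's diagonal enumeration: (m, n) -> m + T(m + n)."""
    return m + T(m + n)


def unpair(i):
    """Inverse of pair: find s with T(s) <= i < T(s + 1), then split s = m + n."""
    s = 0
    while T(s + 1) <= i:
        s += 1
    m = i - T(s)
    return m, s - m
```

The linear search mirrors the text; a faster variant could locate `s` with an integer square root, at the cost of departing from the algorithm described here.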


4.8.3 The Bernstein–Cantor–Schröder Theorem


The following theorem guarantees the existence of a bijection between two sets,
provided that there exist injections from one set to the other and vice versa.
According to Suppes [128, p. 95], Cantor conjectured the theorem and then Bernstein and
Schröder proved it independently of each other in the 1890s. Fraenkel [37, p. 102]
credits the following proof to J. M. Whitaker.


4.167 Theorem (Bernstein–Cantor–Schröder). For all sets A and B, if there are
injections F : A → B and G : B → A, then there is a bijection H : A → B.


Proof. The strategy of this proof consists in producing a subset E ⊆ A such that

G"[B \ F"(E)] = (A \ E)

and then in setting

H := (F|_E) ∪ (G°⁻¹|_{A\E}).


To this end, define

𝒟 := {C ⊆ A : G"[B \ F"(C)] ⊆ (A \ C)}
   = {C ⊆ A : C ⊆ A \ G"[B \ F"(C)]};


the proof will verify that ⋃𝒟 satisfies the requirements.

First, for all subsets V, W ⊆ A, if V ⊆ W, then F"(V) ⊆ F"(W), whence
[B \ F"(W)] ⊆ [B \ F"(V)], and hence G"[B \ F"(W)] ⊆ G"[B \ F"(V)], so that

{A \ G"[B \ F"(V)]} ⊆ {A \ G"[B \ F"(W)]}.


In particular, if V ∈ 𝒟, then V ⊆ A \ G"[B \ F"(V)] by definition of 𝒟, and
V ⊆ W := ⋃𝒟 by definition of ⋃𝒟; consequently

V ⊆ {A \ G"[B \ F"(V)]} ⊆ {A \ G"[B \ F"(⋃𝒟)]}.


Because these inclusions hold for every element V ∈ 𝒟, it follows that they also
hold for their union:

⋃𝒟 ⊆ {A \ G"[B \ F"(⋃𝒟)]}.

Let

E := A \ G"[B \ F"(⋃𝒟)].


From ⋃𝒟 ⊆ E it follows that

A \ G"[B \ F"(⋃𝒟)] ⊆ {A \ G"[B \ F"(E)]},

so that

E ⊆ {A \ G"[B \ F"(E)]}


whence E ∈ 𝒟. Consequently, E ⊆ ⋃𝒟, whence E = ⋃𝒟, but then the definition
of E gives

E = A \ G"[B \ F"(E)].

Hence G"[B \ F"(E)] = A \ E, so that F|_E is a bijection from E onto F"(E), while
G°⁻¹|_{A\E} is a bijection from A \ E onto B \ F"(E); therefore H is a bijection
from A onto B. □
The next theorem shows a use of the Bernstein–Cantor–Schröder Theorem.

4.168 Theorem. There exists a bijection H : ℤ → ℚ.


Proof. First, there exists an injection F : ℤ → ℚ with F(N) := N/1.

Second, there exists an injection I : ℚ → (ℤ × ℤ) such that I(P/Q) := (P, Q)
with P ∈ ℕ minimum (theorem 4.99).

Also, there exists an injection J : (ℤ × ℤ) → ℤ, by theorem 4.166.

Consequently, the composition G := J ∘ I is an injection ℚ → ℤ.

Therefore, the Bernstein–Cantor–Schröder Theorem guarantees the existence of
a bijection H : ℤ → ℚ. □


Theorem 4.169 shows that every infinite subset of a denumerable set is
denumerable. Hence every subset of a countable set is also countable.


4.169 Theorem. For each infinite subset S ⊆ ℕ, there exists an injection H : ℕ →
S. Hence S is denumerable.


Proof. Define G : 𝒫(S) → 𝒫(S) by G(S) := S and G(B) := B ∪ {min(S \ B)} for
every B ⊊ S.

Define F : ℕ → 𝒫(S) by F(0) := {min(S)}, and F(N + 1) := G[F(N)].
Thus F(N + 1) inserts the smallest element of S that has not yet been included by
F(0), ..., F(N).

Let H(M) := max[F(M)]. Then H : ℕ → S is injective, because H(M) < H(M + 1)
for every M ∈ ℕ.

Also, the inclusion function ι : S → ℕ defined in example 3.93 by ι(X) = X is
injective by example 3.121.

Hence there exists a bijection between S and ℕ, by the Bernstein–Cantor–
Schröder Theorem (4.167). □
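
The function H in the proof of theorem 4.169 lists the elements of S in increasing order. The Python sketch below illustrates this under the assumption that S is given by a membership test `in_s` (a hypothetical parameter) and that S is infinite; for a finite S the generator would search forever once S is exhausted.

```python
from itertools import count, islice


def enumerate_subset(in_s):
    """Yield the elements of S = {n : in_s(n)} in increasing order."""
    for n in count():          # 0, 1, 2, ...
        if in_s(n):
            yield n


def H(in_s, m):
    """Return H(m), the element of S preceded by exactly m smaller elements of S."""
    return next(islice(enumerate_subset(in_s), m, None))
```

For example, with the odd numbers as S, the values H(0), H(1), H(2), H(3) are 1, 3, 5, 7, and H is strictly increasing, hence injective.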


Theorem 4.170 shows that every subset of the range of a function defined on a 
countable domain is also countable. 


4.170 Theorem. For each set E, if there exists a surjection Q : ℕ → E, then E is
countable.


Proof. If E is finite, then E is countable, by definition.

If E is not finite, then define an injection G : E → ℕ by G(X) :=
min[Q°⁻¹"({X})], so that G(X) is the smallest natural number mapped to X by
Q. The function G is well-defined, because Q is surjective, so that the pre-image
Q°⁻¹"({X}) ≠ ∅ contains a unique smallest element. The function G is also
injective; indeed, if X ≠ Y, then Q°⁻¹"({X}) ∩ Q°⁻¹"({Y}) = ∅, because Q is a
function. Consequently, the image S := G"(E) ⊆ ℕ is an infinite subset of ℕ.

By theorem 4.169 and the Bernstein–Cantor–Schröder Theorem (4.167), it
follows that S and hence E are denumerable. □
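
The injection G(X) := min[Q°⁻¹"({X})] from the proof of theorem 4.170 records the first index at which each value of Q appears. The following Python sketch computes G on a finite window of indices only, which is an assumption of the illustration; the name `first_indices` and its parameters are hypothetical.

```python
def first_indices(q, indices):
    """Map each value x of q to the least n among indices with q(n) == x.

    The indices must be supplied in increasing order.
    """
    g = {}
    for n in indices:
        g.setdefault(q(n), n)   # keep only the first (smallest) index per value
    return g
```

Distinct values always receive distinct indices, which is the injectivity used in the proof.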


4.8.4 Denumerability of all Finite Sequences of Natural Numbers


This subsection shows that the set of all finite sequences of natural numbers is 
denumerable. 

Theorem 4.171 shows that allowing the number of variables to change from one
function to another still produces a set of functions.


4.171 Theorem. For all sets E and C, all the functions F : E^K → C of any finite
number of variables from the domain E and with values in the co-domain C form a
set


ℱ_{E,C} := {C^(E^K) : K ∈ ℕ}.


Proof. Let A := ℕ, B := E, and 𝒥 := ℕ. Thus 𝒥 = ℕ ⊆ 𝒫(ℕ) = 𝒫(A) because
K ⊆ ℕ for each K ∈ ℕ, by theorem 4.20. Then, with the notation of theorem 3.136,


ℱ_{ℕ,E} := {E^K : K ∈ ℕ}


is the set of all finite sequences of elements of E.

Next, again with the notation of theorem 3.136, let A := ⋃ℱ_{ℕ,E}. Then E^K ∈
ℱ_{ℕ,E}, whence E^K ⊆ A for each K ∈ ℕ, by theorem 3.40, and hence ℱ_{ℕ,E} ⊆ 𝒫(A).
Also let 𝒥 := ℱ_{ℕ,E} and B := C, the set C in the hypothesis of the theorem. Thus
C^(E^K) is the set of all functions F : E^K → C defined on all of E^K, with E^K ∈ 𝒥 for
each K ∈ ℕ, and


ℱ_{E,C} := {C^(E^K) : K ∈ ℕ}


is the set of all functions defined on any E^K into C. □


4.172 Theorem. For each K ∈ ℕ*, the set ℕ^K of all finite natural sequences of
length K is denumerable.


Proof. Following [86, p. 799], this proof proceeds by induction with K.

For K = 1, the identity function I_ℕ : ℕ → ℕ, N ↦ N defined in example 3.86 is
a bijection, by example 3.120.

For K = 2, Cantor’s diagonal enumeration 𝒯 : ℕ × ℕ → ℕ from definition 4.163
is a bijection, by theorem 4.166.

Denote the identity function I_ℕ by 𝒯₁, and Cantor’s diagonal enumeration 𝒯
by 𝒯₂.

As an induction hypothesis, assume that for some K = M there exists a bijection
𝒯_M : ℕ^M → ℕ.

For the induction step from K = M to K = M + 1, define 𝒯_{M+1} : ℕ^{M+1} → ℕ
by


𝒯_{M+1}(N₀, ..., N_{M−1}, N_M) := 𝒯₂[𝒯_M(N₀, ..., N_{M−1}), N_M].


Equivalently, 𝒯_{M+1} = 𝒯₂ ∘ (𝒯_M × I_ℕ), where × denotes the product of
functions from definition 3.116, and where 𝒯_M is bijective by induction
hypothesis, whence 𝒯_M × I_ℕ is also bijective, by theorem 3.117, and hence the
composition of the bijection 𝒯₂ preceded by the bijection 𝒯_M × I_ℕ is again
bijective, by theorem 3.123. □


4.173 Theorem. The set ⋃_{K ∈ ℕ*} ℕ^K of all finite sequences of natural numbers is
denumerable.


Proof. For each K ∈ ℕ*, denote by 𝒯_K : ℕ^K → ℕ any enumeration of all natural
sequences of length K, as in theorem 4.172. Again following [86, p. 799], this proof
pieces together all the bijections 𝒯_K of ℕ^K by means of K and 𝒯₂. Specifically,
define 𝒮 : ⋃_{K ∈ ℕ*} ℕ^K → ℕ by


𝒮(N₀, ..., N_{K−1}) := 𝒯₂[𝒯_K(N₀, ..., N_{K−1}), K − 1].


The map 𝒮 is injective: indeed, if 𝒮(N₀, ..., N_{K−1}) = 𝒮(J₀, ..., J_{L−1}),
then 𝒯₂[𝒯_K(N₀, ..., N_{K−1}), K − 1] = 𝒯₂[𝒯_L(J₀, ..., J_{L−1}), L − 1], whence
K = L and 𝒯_K(N₀, ..., N_{K−1}) = 𝒯_L(J₀, ..., J_{L−1}), because 𝒯₂ is injective by
theorem 4.165. Consequently, 𝒯_K(N₀, ..., N_{K−1}) = 𝒯_K(J₀, ..., J_{K−1}), whence
(N₀, ..., N_{K−1}) = (J₀, ..., J_{K−1}), because 𝒯_K is injective, by theorem 4.172.

The map 𝒮 is also surjective: indeed, for each M ∈ ℕ there exists (I, J) ∈ ℕ²
such that 𝒯₂(I, J) = M, because 𝒯₂ is surjective by theorem 4.164. With
K := J + 1 ∈ ℕ*, there exists (N₀, ..., N_{K−1}) ∈ ℕ^K such that
𝒯_K(N₀, ..., N_{K−1}) = I, because 𝒯_K is surjective, also by theorem 4.172. Thus
𝒮(N₀, ..., N_{K−1}) = 𝒯₂(I, K − 1) = M. □
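
The enumerations of theorems 4.172 and 4.173 can be sketched by iterating the diagonal enumeration. This Python illustration rests on several assumptions: sequences are nonempty tuples of nonnegative integers, all function names are hypothetical, and the length K is recorded as K − 1 in the second coordinate so that every natural number, including 0, encodes exactly one nonempty sequence.

```python
def T(n):
    """Triangular number T(n) = n * (n + 1) / 2."""
    return n * (n + 1) // 2


def pair(m, n):
    """Cantor's diagonal enumeration of pairs."""
    return m + T(m + n)


def unpair(i):
    """Inverse of pair, by searching for s with T(s) <= i < T(s + 1)."""
    s = 0
    while T(s + 1) <= i:
        s += 1
    m = i - T(s)
    return m, s - m


def encode_tuple(ns):
    """Fixed-length encoding: identity for length 1, then pair in one more entry."""
    code = ns[0]
    for n in ns[1:]:
        code = pair(code, n)
    return code


def decode_tuple(code, k):
    """Inverse of encode_tuple for tuples of known length k >= 1."""
    out = []
    for _ in range(k - 1):
        code, n = unpair(code)
        out.append(n)
    out.append(code)
    return tuple(reversed(out))


def encode_sequence(ns):
    """Encode a nonempty sequence together with its length (stored as k - 1)."""
    return pair(encode_tuple(ns), len(ns) - 1)


def decode_sequence(code):
    """Inverse of encode_sequence."""
    i, j = unpair(code)
    return decode_tuple(i, j + 1)
```

Decoding every code in an initial segment of ℕ and encoding the result again returns the same code, which is a finite check of the bijectivity claimed by the theorem.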


4.8.5 Other Infinite Sets 


The present text has defined a set to be finite if and only if there exists a bijection 
onto a natural number, and infinite if and only if there does not exist any such 
bijection. There exists a different definition of infinite sets, called “Dedekind- 
infinite” [128, p. 107], corresponding to Dedekind’s definition [25, §V, #64, p. 63]. 


4.174 Definition. A set Z is Dedekind-infinite if and only if there exist a proper
subset Y ⊊ Z and a bijection F : Z → Y, or, equivalently, F°⁻¹ : Y → Z.


Thus, a set is Dedekind-infinite if and only if it has the same cardinality as that
of one of its proper subsets.


4.175 Example. The set ℕ is Dedekind-infinite because there is a proper subset
ℕ* ⊊ ℕ and a bijection defined by the successor function:


G : ℕ → ℕ*,
X ↦ X ∪ {X}.

4.176 Theorem. Every Dedekind-infinite set is also infinite.


Proof. This proof proceeds by contradiction. If a set Z is not infinite, in other words,
if Z is finite, then there exist a natural number N ∈ ℕ and a bijection G : N → Z.
If Z is also Dedekind-infinite, then there exist a proper subset Y ⊊ Z and a bijection
F : Z → Y. Then the composition


H = G°⁻¹ ∘ F ∘ G : N → Z → Y → N


would be a bijection from N onto a proper subset G°⁻¹"(Y) ⊊ N, which would
contradict theorem 4.148. □


The proof of the converse requires the Axiom of Choice (exercises 4.139, 4.140).

4.177 Theorem. Every infinite set contains a denumerable subset.


Proof. Apply recursion (theorem 4.21). If Z is infinite, then Z ≠ ∅, whence there
exists some X ∈ Z. Define F₁ : {0} → Z by 0 ↦ X. Assume that there exists
an injection F_N : N → Z. Then Z ≠ F_N"(N) because Z is infinite, whence
Z \ F_N"(N) ≠ ∅ and there exists X ∈ Z \ F_N"(N). Let H_N : {N} → {X} and let
F_{N+1} := F_N ∪ H_N. Finally, let F := ⋃_{N ∈ ℕ*} F_N. Then F : ℕ → Z is an
injection. □


Theorem 4.177 explains the subscript 0 in the notation ℵ₀ for the cardinality of ℕ.
Because in the Zermelo–Fraenkel set theory with the Axiom of Choice every infinite
set Z contains a denumerable subset, there exists an injection F : ℕ → Z, so that
the cardinality of ℕ cannot exceed that of Z. If there is also an injection G : Z → ℕ,
then the Bernstein–Cantor–Schröder Theorem (theorem 4.167) guarantees that there
is also a bijection H : ℕ → Z; thus, the cardinality of Z cannot be strictly smaller
than that of ℕ. Thus ℵ₀ represents the “smallest” infinite cardinality.


4.178 Theorem. Every infinite set is also Dedekind-infinite. 


Proof. If W is infinite, then W contains a denumerable subset Z ⊆ W, by
theorem 4.177. By example 4.175, there exists a bijection F : Z → Y onto a proper
subset Y ⊊ Z. Extend F to all of W by setting H := F ∪ (I_W|_{W\Z}). □


4.179 Theorem. Every denumerable union of disjoint denumerable sets is denu- 
merable. 


Proof. If ℱ is denumerable, then there is a bijection A : ℕ → ℱ with I ↦ A(I). If
each A(I) is denumerable, then by the Axiom of Choice there is a bijection F_I : ℕ →
A(I) with J ↦ F_I(J) ∈ A(I). Hence the function G : ℕ × ℕ → ⋃ℱ defined by
G(I, J) := F_I(J) is a bijection. □


4.8.6 Further Issues in Cardinality


4.8.6.1 Other Axioms of Infinity


Bolzano [10] and Dedekind [25, §V, Theorem 66, p. 64] argued for the “existence”
of an infinite set from practical considerations. Yet the “existence” of an infinite set
does not follow from the other axioms, but requires an axiom of infinity [8, p. 21].
Zermelo [145] introduced such an axiom with a set such that if X is an element, then
{X} is also an element. The variant adopted here, with {X} replaced by X ∪ {X}, is
attributed to John von Neumann [135] and has proved more convenient [8, p. 22].
There also exist “infinitely” many other nonequivalent axioms of infinity [18, §57,
p. 342-346].


4.8.6.2 Peano’s Axioms 


There is an alternative method to introduce natural numbers, published by Giuseppe
Peano, which does not involve set theory. Instead, Peano’s system consists of the
following axioms, stated here beginning with 0, as done in [46, §2, #II], [128,
p. 121].


Axiom A0 0 is a natural number.

Axiom A1 K = L means that K and L are the same natural number.

Axiom A2 For each natural number N there exists exactly one natural number
denoted by N′ and called the successor of N.

Axiom A3 N′ ≠ 0 for each natural number N.

Axiom A4 If K′ = L′, then K = L.

Axiom A5 For each set S of natural numbers, if

• 0 is an element of S, and
• if N belongs to S, then N′ belongs to S,

then every natural number is an element of S.


From Peano’s axioms, and two additional axioms for recursive definitions of 
addition and multiplication [128, p. 136], the same proofs as in this chapter verify 
all the algebraic properties of arithmetic and ordering with natural numbers [76, 
Ch. 1, p. 1-18]. However, in situations that involve other topics, for example,
rational numbers or cardinality of sets, the use of Peano’s axioms would require 
some theoretical link between Peano’s natural numbers and other sets, in other 
words, some means of including Peano’s arithmetic and applications within the 
same framework. This theoretical link can be the development of arithmetic from 
within set theory, as done here. 


4.8.6.3 Alternative Sequence of Developments


The progression from ℕ to ℤ and then to ℚ allows for subtractions with integers
without requiring rational numbers, and then allows for divisions and subtractions
with rational numbers without requiring the development of “real” numbers.
An alternative development proceeds from ℕ to ℚ⁺, then to the nonnegative “real”
numbers ℝ⁺ and finally to the “real” numbers ℝ and the “complex” numbers ℂ, as
outlined by Edmund Landau [76]. Yet other constructions of the set of real numbers
from the axioms are detailed by Michael Henle [58] and John Stillwell [120].


4.8.6.4 The Generalized Continuum Hypothesis 


Because there is no surjection from ℕ to 2^ℕ (theorem 4.156), the question arises
whether there is any set S with a cardinality strictly between the cardinalities of ℕ
and 2^ℕ. In other words, the question pertains to the existence of a set S for which
there exist injections


ℕ ↪ S ↪ 2^ℕ


but no injections from 2^ℕ back to S and no injections from S back to ℕ. The
hypothesis that no such set as S exists is called the continuum hypothesis.

More generally, because for every infinite set X there does not exist any surjection
from X to 2^X, the question arises whether there exists any set S with a cardinality
strictly between the cardinalities of X and 2^X. In other words, the question pertains
to the existence of a set S for which there exist injections


X ↪ S ↪ 2^X


but no injections from 2^X back to S and no injections from S back to X. The
hypothesis that no such set as S exists is called the generalized continuum
hypothesis.

The axioms of set theory (S1–S8) are consistent with both the generalized
continuum hypothesis (as proved by K. Gödel [44, 45]) and with the negation of the
generalized continuum hypothesis (as proved by P. J. Cohen [19, 20, 22]). Therefore,
if the axioms of set theory are consistent, then the generalized continuum hypothesis 
can be neither proved nor disproved within set theory. The generalized continuum 
hypothesis is thus logically independent from set theory. Hence there exist two 
mutually exclusive extensions of set theory, one extension with the generalized 
continuum hypothesis, the other extension with the negation of the generalized 
continuum hypothesis. 


4.8.7 Exercises on Infinite Sets 


4.121. Prove that if A is infinite and if A ⊆ B, then B is also infinite.


4.122. Prove that if A is denumerable and Z ∉ A, then A ∪ {Z} is denumerable.
4.123. With A denumerable, B finite, disjoint, prove that A ∪ B is denumerable.
4.124. Prove that if A is denumerable and B finite, then A ∪ B is denumerable.
4.125. Prove that if A and B are denumerable, then A ∪ B is denumerable.
4.126. Prove that there exists a bijection from 2^ℕ to 𝒫(ℕ).

4.127. Prove that 𝒫(ℕ) is not countable.

4.128. Prove: if A is denumerable and B is finite, then A × B is countable.
4.129. For each nonempty set X prove that there is no injection 2^X → X.
4.130. Prove that if [#(X)] = [#(Y)], then [#(2^X)] = [#(2^Y)].

4.131. Prove: if [#(X)] < [#(Y)] are both finite, then [#(2^X)] < [#(2^Y)].
4.132. Prove that if X is a finite set, then 2^X is also finite.

4.133. Prove that if X is a finite set, then [#(X)] < [#(2^X)].

4.134. Prove that every infinite subset S ⊆ ℕ is denumerable.

4.135. For each denumerable set A prove that every subset S ⊆ A is countable.
4.136. Prove: if A is uncountable and A ⊆ B, then B is uncountable.

4.137. Prove that if A, B, and C are denumerable, then so is (A × B) × C.
4.138. Prove that if A × B is denumerable, then A and B are countable.

4.139. Show where the proof of theorem 4.179 invokes the Axiom of Choice.
4.140. Show where the proof of theorem 4.177 invokes the Axiom of Choice.
Chapter 5 
Well-Formed Sets: Proofs by Transfinite 
Induction with Already Well-Ordered Sets 


5.1 Introduction 


This chapter focuses on “well-formed” sets, which are defined by means restricted 
to the axioms of Zermelo-Fraenkel set theory from chapters 1, 2, 3, and 4. The 
main result states that no two well-formed sets are members of each other, and 
consequently that every well-formed set is not an element of itself. 

This chapter shows the dependence of one axiom on the others for sets — called 
well-formed sets — defined in specific ways solely through the axioms of set theory. 
Such well-formed sets suffice for most of logic, mathematics, computer science, and 
their applications to the sciences and engineering. Among other features, the result 
shows that no well-formed set is an element of itself, and that this result is provable 
from the other axioms of set theory [74]. This chapter also provides a way to revisit 
induction (chapter 4) on a different level. 


5.2. Transfinite Methods 


5.2.1 Transfinite Induction 


Transfinite methods lead to an example of decidability in set theory. On the set 
N, the Principle of Mathematical Induction (theorem 4.15) is logically equivalent 
to the Well-Ordering Principle (theorem 4.47), which states that every nonempty 
subset S ⊆ ℕ has a smallest element. All well-ordered sets also lend themselves to
a method of proof known as transfinite induction, which relies on “initial intervals” 
in well-ordered sets. All sets in this chapter are already well-ordered. 


5.1 Definition. For each set W well-ordered (definition 3.194) by a relation ≼ and
for each C ∈ W, the initial interval determined by C is the subset


W_C := {B ∈ W : (B ≼ C) ∧ (B ≠ C)},


which consists of all elements preceding C but different from C.

5.2 Example. If W = ℕ, with ≤ for ≼, then ℕ_N = N for each N ∈ ℕ.

The Principle of Mathematical Induction extends to all well-ordered sets.


5.3 Theorem (Transfinite Induction). For each set W well-ordered by ≼ and for
each set V, if V ⊆ W, then V = W if and only if


∀C{[(C ∈ W) ∧ (W_C ⊆ V)] ⇒ (C ∈ V)}.


Proof. If V = W, then the formula is a tautology: [(P) ∧ (Q)] ⇒ (P). For the
converse, let U := W \ V. If U ≠ ∅, then U has a first element A ∈ U. Thus, if
B ≼ A but B ≠ A, then B ∉ U, whence B ∈ W \ U = V. Hence the initial interval
W_A ⊆ V, but then A ∈ V by hypothesis on V, which contradicts A ∈ U = W \ V. □
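On a finite well-ordered set, the criterion of theorem 5.3 can be tested directly. The following Python sketch is an illustration only: it assumes that the well-ordered set is given as a list in increasing order, and the names initial_interval and satisfies_induction_hypothesis are hypothetical helpers introduced here. It enumerates every subset V of a four-element set W and keeps those for which each C with W_C ⊆ V also lies in V.

```python
from itertools import combinations


def initial_interval(order, c):
    """Return W_C: the elements of the list `order` strictly before c."""
    return set(order[:order.index(c)])


def satisfies_induction_hypothesis(order, v):
    """Check that every C in W with W_C a subset of V also lies in V."""
    return all(c in v for c in order if initial_interval(order, c) <= v)


W = ["a", "b", "c", "d"]  # listed in increasing order
all_subsets = [set(s) for r in range(len(W) + 1) for s in combinations(W, r)]
inductive_subsets = [v for v in all_subsets
                     if satisfies_induction_hypothesis(W, v)]

# Only V = W passes, in agreement with theorem 5.3.
print(inductive_subsets == [set(W)])  # prints True
```

The empty subset fails already at the first element A of W, because W_A = ∅ is a subset of every V, so the hypothesis forces A ∈ V.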


Well-ordered sets also lend themselves to a method of definition known as 
transfinite construction, which relies on the concept of “ideal” in a well-ordered 
set. 


5.4 Definition (ideal). For each set W well-ordered by a relation ≼, a subset V ⊆
W is an ideal of W if and only if V contains every element preceding any of its
elements. Thus V is an ideal if and only if W_C ⊆ V for every C ∈ V:


∀B∀C{[(B ∈ W) ∧ (B ≼ C) ∧ (C ∈ V)] ⇒ (B ∈ V)}.


The set of all ideals of W relative to ≼ is denoted by ℐ_≼(W).

5.5 Example. If W = ℕ, with ≤ for ≼, then N is an ideal for each N ∈ ℕ.

The following two theorems yield relations between ideals and initial intervals.


5.6 Theorem. For each set W well-ordered by a relation ≼ and for each ideal
V ⊆ W, if B ∈ W \ V, then V ⊆ W_B.


Proof. By definition of an ideal, if C ∈ V and B ∈ W with B ≼ C, then B ∈ V.
By contraposition, C ∉ V follows from B ∈ W \ V, and C ∈ W with B ≼ C.
Because ≼ totally orders W (remark 3.195), it follows that if B ∈ W \ V, then
(W \ V) ⊇ [W \ (W_B ∪ {B})], whence V ⊆ W_B ∪ {B}. Yet B ∉ V, whence V ⊆ W_B. □


5.7 Theorem. For each set W well-ordered by a relation ≼ and for each ideal
V ⊊ W, there exists a smallest element A ∈ W \ V, and for this element W_A = V.


Proof. If V ⊊ W, then W \ V ≠ ∅. Because W is well-ordered by ≼, it follows that
W \ V contains a smallest element A ∈ W \ V. Thus for every B ∈ W such that
B ≼ A and B ≠ A, it follows that B ∈ V by minimality of A in W \ V. Therefore
W_A ⊆ V. Moreover, V ⊆ W_A by theorem 5.6. □
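Theorem 5.7 can be observed on a finite well-ordered set. The following Python sketch is an illustration under the assumption that W is a list in increasing order; the names initial_interval and is_ideal are hypothetical helpers. It lists all ideals of a four-element set and confirms that each proper ideal V equals the initial interval W_A determined by the smallest element A of W \ V.

```python
from itertools import combinations


def initial_interval(order, c):
    """Return W_C: the elements of the list `order` strictly before c."""
    return set(order[:order.index(c)])


def is_ideal(order, v):
    """V is an ideal when W_C is a subset of V for every C in V."""
    return all(initial_interval(order, c) <= v for c in v)


W = [10, 20, 30, 40]  # listed in increasing order
ideals = [set(s) for r in range(len(W) + 1)
          for s in combinations(W, r) if is_ideal(W, set(s))]

for v in ideals:
    if v != set(W):
        a = next(x for x in W if x not in v)  # smallest element of W \ V
        assert initial_interval(W, a) == v    # theorem 5.7: W_A = V

print(len(ideals))  # prints 5
```

The five ideals are the four initial intervals together with W itself, which is an ideal but not an initial interval.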
The following theorems show that the set of all ideals is well-ordered by 
inclusion. 


5.8 Theorem. For each set W well-ordered by a relation ≼ and for each nonempty
set ℱ of ideals of W, the intersection ⋂ℱ is also an ideal of W.


Proof. If C ∈ ⋂ℱ, then C ∈ V for each ideal V ∈ ℱ. Hence, if B ∈ W and B ≼ C,
then B ∈ V because V is an ideal. This conclusion holds for each V ∈ ℱ, whence
B ∈ ⋂ℱ. Thus ⋂ℱ is an ideal of W. □


5.9 Theorem. For each set W well-ordered by a relation ≼ and for each nonempty
set ℱ of ideals of W, the smallest element of ℱ is ⋂ℱ. Therefore the set ℐ_≼(W)
is well-ordered by inclusion (⊆).


Proof. For each nonempty set ℱ of ideals of W, there exists at least one ideal
U ∈ ℱ, and the intersection ⋂ℱ is also an ideal, by theorem 5.8. Let


Z := {C ∈ W : C ∈ (⋃ℱ) \ (⋂ℱ)}.


If Z = ∅, then ⋃ℱ = ⋂ℱ, whence ℱ contains only one ideal, in effect U =
⋂ℱ.

If Z ≠ ∅, then it has a smallest element A ∈ Z ⊆ W. Because ⋂ℱ is an ideal
by theorem 5.8, and because A ∉ ⋂ℱ, it follows that W_A = ⋂ℱ by theorem 5.7.

Also because A ∉ ⋂ℱ, there exists an ideal V ∈ ℱ with A ∉ V. Hence V ⊆ W_A
by theorem 5.6.

From V ⊆ W_A and W_A = ⋂ℱ it follows that V ⊆ ⋂ℱ, but ⋂ℱ ⊆ V.
Consequently ⋂ℱ = V ∈ ℱ.

Therefore every nonempty set ℱ ⊆ ℐ_≼(W) has a smallest element, in effect
⋂ℱ, so that ℐ_≼(W) is well-ordered by set inclusion. □


5.2.2 Transfinite Construction 


As the Principle of Mathematical Induction leads to a method of definition by 
induction (theorem 4.21) — also called a recursive definition or recursion — 
similarly transfinite induction also yields a method of definition by transfinite 
induction. 


5.10 Theorem (Transfinite Construction). For each nonempty set W well-ordered
by a relation ≼ with first element A ∈ W, and for each nonempty set
E, let


Y := ⋃_{B∈W} E^(W_B)


denote the set of all functions with domain equal to an initial interval W_B and with
range in E (definition 4.153). For each Z ∈ E, and for each function P : Y → E,
there exists exactly one function F : W → E such that F(A) = Z and F(B) =
P(F|_{W_B}).


Proof. This proof establishes the uniqueness and existence separately. 


Uniqueness 


There is at most one such function. Indeed, if F : W → E and G : W → E are two
such functions, with F(A) = Z = G(A) and F(B) = P(F|_{W_B}), G(B) = P(G|_{W_B}),
then let


S := {C ∈ W : F(C) ≠ G(C)}.


If S ≠ ∅, then S has a smallest element, D ∈ S. Hence F(B) = G(B) for every
B ≺ D in W, which means that F|_{W_D} = G|_{W_D}, but then


F(D) = P(F|_{W_D}) = P(G|_{W_D}) = G(D)


would contradict D ∈ S. Consequently, S = ∅, so that F(B) = G(B) for every
B ∈ W, which means that F = G.


Existence 


Let ℱ denote the set of all ideals V ⊆ W for which there exists a function
F_V : V → E such that F_V(A) = Z and F_V(B) = P(F_V|_{W_B}).

For each ideal U ∈ ℱ, applying the uniqueness just proved to the well-ordered
set U ∩ V instead of W shows that the functions F_U and F_V coincide on the
well-ordered subset U ∩ V in W.

Hence, define a function F_ℱ : ⋃ℱ → E by setting F_ℱ(B) := F_U(B) for
any ideal U ∈ ℱ with B ∈ U. In other terms, F_ℱ = ⋃_{U∈ℱ} F_U. The preceding
argument confirms that this definition does not depend on which ideal U contains
B, because if B ∈ U and B ∈ V, then F_U(B) = F_V(B).

Next, if an ideal U is an initial interval, U = W_B for some B ∈ W, and if W_B ∈ ℱ,
then W_B ∪ {B} ∈ ℱ. Indeed, a function F_U : U → E extends to W_B ∪ {B} by the
definition F_{W_B ∪ {B}}(B) := P(F_U).

Suppose that W ∉ ℱ. Then let 𝒢 denote the set of all the ideals of W that are not
elements of ℱ. In particular, W ∈ 𝒢. Define V := ⋂𝒢, which is then the smallest
ideal of W in 𝒢. If V had a last element D, then V = W_D ∪ {D} by definition of
W_D and of an ideal; however, W_D ∈ ℱ, otherwise V ≠ ⋂𝒢, but from W_D ∈ ℱ it
follows that W_D ∪ {D} ∈ ℱ. Thus V cannot have a last element.

If V does not have a last element, then V = ⋃_{B∈V} W_B by definition of an ideal.
Again, it follows that W_B ∈ ℱ for each B ∈ V, whence V = ⋃_{B∈V} W_B ⊆ ⋃ℱ
and then V ∈ ℱ because of the existence of F_ℱ, contradicting the definition of V.
Therefore, W ∈ ℱ, which means that F extends to all of W. □
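On a finite well-ordered set, the function F of theorem 5.10 can be built one element at a time. The following Python sketch is an illustration only: it assumes that W is a list in increasing order and represents each restriction F|_{W_B} as a dictionary; the names transfinite_construction and one_plus_sum are hypothetical. The rule P chosen here returns one plus the sum of all earlier values.

```python
def transfinite_construction(order, z, p):
    """Build F on a finite well-ordered list: F(first) = z and
    F(b) = p(F restricted to the initial interval W_b) otherwise."""
    f = {}
    for index, b in enumerate(order):
        if index == 0:
            f[b] = z  # value prescribed at the first element
        else:
            restriction = {a: f[a] for a in order[:index]}  # F|_{W_b}
            f[b] = p(restriction)
    return f


def one_plus_sum(restriction):
    """Example rule P: one plus the sum of all earlier values."""
    return 1 + sum(restriction.values())


F = transfinite_construction([0, 1, 2, 3, 4], 1, one_plus_sum)
print(F)  # prints {0: 1, 1: 2, 2: 4, 3: 8, 4: 16}
```

Each value depends on the whole restriction to the preceding initial interval, not merely on the immediately preceding value, which is what distinguishes this construction from the simpler recursion of theorem 4.21.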


5.2.3. Exercises on Transfinite Methods 


5.1. Prove that in each well-ordered set each initial interval is an ideal. 


5.2. Provide an example of an ideal that is not an initial interval in a well- 
ordered set. 


5.3. Prove that if ℤ is well-ordered by ≼, then ≼ differs from ≤.
5.4. Prove that if ℚ is well-ordered by ≼, then ≼ differs from ≤.


5.5. Provide an example of a well-order ≼ on a set of modular integers ℤ_M =
{[0]_M, …, [M − 1]_M} and modular integers [I]_M, [K]_M, [L]_M, such that [K]_M ≼ [L]_M
but [I]_M + [K]_M ⋠ [I]_M + [L]_M.

5.6. Provide an example of a well-order ≼ on a set of modular integers ℤ_M =
{[0]_M, …, [M − 1]_M} and modular integers [I]_M, [K]_M, [L]_M, such that [0]_M ≼ [I]_M
and [K]_M ≼ [L]_M but [I]_M * [K]_M ⋠ [I]_M * [L]_M.


5.7. Prove that every subset of a well-ordered set is also well-ordered. 


5.8. Determine whether for each set ℱ of ideals in W the union ⋃ℱ is also an
ideal in W.


5.9. Determine whether for each set ℱ of initial intervals in W the union ⋃ℱ is
also an initial interval in W.


5.10. Determine whether for each set ℱ of initial intervals in W the intersection
⋂ℱ is also an initial interval in W.


5.3 Transfinite Sets and Ordinals 


5.3.1 Transitive Sets 


Sets defined exclusively through the axioms of set theory adopted here are called 
well-formed sets. They have the advantage of avoiding certain contradictions that 
would arise from defining sets by means not so strict. The definition of well-formed 
sets involves the concept of sets that are “transitive” relative to the relation ∈.

5.11 Definition (Transitive Sets). A set A is transitive if and only if every element
of A is also a subset of A, so that ∀X[(X ∈ A) ⇒ (X ⊆ A)], or, equivalently,


∀Y∀X{[(Y ∈ X) ∧ (X ∈ A)] ⇒ (Y ∈ A)}.


5.12 Example. The following sets are transitive:


∅,
{∅},
{∅, {∅}},
{∅, {∅}, {∅, {∅}}},
{∅, {∅}, {{∅}}, {∅, {∅}}}.


5.13 Counterexample. The set A := {{∅}} is not transitive, because it contains
an element X := {∅} that is not a subset of A: X ⊈ A, because ∅ ∈ X but ∅ ∉ A.
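Definition 5.11 translates directly into a test on finite sets. The following Python sketch is an illustration that models sets built from the empty set as nested frozenset values; the constant names and the function is_transitive are hypothetical. It checks the sets of example 5.12 and of counterexample 5.13.

```python
EMPTY = frozenset()                       # ∅
ONE = frozenset({EMPTY})                  # {∅}
TWO = frozenset({EMPTY, ONE})             # {∅, {∅}}
THREE = frozenset({EMPTY, ONE, TWO})      # {∅, {∅}, {∅, {∅}}}
POWER_OF_TWO = frozenset({EMPTY, ONE, frozenset({ONE}), TWO})
NOT_TRANSITIVE = frozenset({ONE})         # {{∅}}


def is_transitive(a):
    """A set is transitive when each of its elements is also a subset."""
    return all(x <= a for x in a)


for example in (EMPTY, ONE, TWO, THREE, POWER_OF_TWO):
    assert is_transitive(example)

print(is_transitive(NOT_TRANSITIVE))  # prints False
```

For frozenset values the comparison x <= a is the subset test, so the function states the definition almost verbatim.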


Power sets, unions, and intersections of transitive sets are also transitive.

5.14 Theorem. If a set A is transitive, then 𝒫(A) is also transitive.


Proof. If S ∈ 𝒫(A), then S ⊆ A. Thus if X ∈ S, then X ∈ A, and X ⊆ A by
transitivity of A. Hence X ∈ 𝒫(A) for each X ∈ S, whence S ⊆ 𝒫(A). □


5.15 Theorem. If a set ℱ is transitive, then ⋃ℱ is also transitive.


Proof. If S ∈ ⋃ℱ, then there exists A ∈ ℱ with S ∈ A. Yet A ⊆ ℱ by transitivity
of ℱ. From S ∈ A and A ⊆ ℱ follows S ∈ ℱ, whence S ⊆ ⋃ℱ. □


5.16 Theorem. If ℱ is a nonempty set of transitive sets, then ⋃ℱ and ⋂ℱ are
also transitive.


Proof. If S ∈ ⋃ℱ, then S ∈ A for some A ∈ ℱ, whence S ⊆ A by transitivity of
A, and hence S ⊆ A ⊆ ⋃ℱ, so that ⋃ℱ is transitive.

If S ∈ ⋂ℱ, then S ∈ A for every A ∈ ℱ, whence S ⊆ A by transitivity of A,
and hence S ⊆ ⋂ℱ, so that ⋂ℱ is transitive. □
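Theorems 5.14 and 5.16 can be spot-checked on small sets. The following Python sketch is an illustration on nested frozenset values; the helper names power_set, union, and intersection are hypothetical, and a finite check of this kind is of course no substitute for the proofs.

```python
from itertools import combinations

EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
THREE = frozenset({EMPTY, ONE, TWO})


def is_transitive(a):
    """A set is transitive when each of its elements is also a subset."""
    return all(x <= a for x in a)


def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))


def union(family):
    """Return the union of a family of sets."""
    return frozenset(x for member in family for x in member)


def intersection(family):
    """Return the intersection of a nonempty family of sets."""
    return frozenset.intersection(*list(family))


family = [ONE, THREE, power_set(TWO)]           # transitive sets
assert is_transitive(power_set(power_set(TWO)))  # theorem 5.14, twice
assert is_transitive(union(family))              # theorem 5.16
assert is_transitive(intersection(family))       # theorem 5.16
print(len(power_set(power_set(TWO))))            # prints 16
```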


5.3.2. Ordinals 


Well-formed sets will rely on the concept of ordinals (also called “ordinal numbers” 
[30, p. 42]). The following definition conforms to Kunen’s [74, p. 16]. 


5.17 Definition (ordinals). A set A is an ordinal if and only if it is a transitive set,
and the relation ∈ is an irreflexive well-ordering on the set A.

5.18 Example. The following sets are ordinals:


∅,
{∅},
{∅, {∅}},
{∅, {∅}, {∅, {∅}}}.


5.19 Counterexample. The set


A := {∅, {∅}, {{∅}}, {∅, {∅}}} = 𝒫({∅, {∅}})


is transitive but not an ordinal. Indeed, if, for instance,


X := {{∅}},  Y := {∅, {∅}},


then X ∈ A and Y ∈ A, so that {X, Y} ⊆ A, but {X, Y} does not have any smallest
element relative to the relation ∈, because X ∉ Y and Y ∉ X.
In particular the subset {X, Y} is not an ordinal.
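For finite sets, definition 5.17 reduces to a finite check, because a strict total order on a finite set is automatically a well-ordering. The following Python sketch is an illustration on nested frozenset values; the function is_ordinal is a hypothetical helper, and the finiteness assumption is essential. It accepts the ordinals of example 5.18 and rejects the set of counterexample 5.19.

```python
EMPTY = frozenset()
ONE = frozenset({EMPTY})
TWO = frozenset({EMPTY, ONE})
THREE = frozenset({EMPTY, ONE, TWO})
POWER_OF_TWO = frozenset({EMPTY, ONE, frozenset({ONE}), TWO})


def is_transitive(a):
    """A set is transitive when each of its elements is also a subset."""
    return all(x <= a for x in a)


def is_ordinal(a):
    """Finite test: a is transitive and membership is a strict total
    order on a, which suffices for a well-ordering on a finite set."""
    if not is_transitive(a):
        return False
    elements = list(a)
    for x in elements:
        for y in elements:
            if x != y and not (x in y or y in x):
                return False  # membership is not connected on a
            if x in y and y in x:
                return False  # membership is not strict on a
            for z in elements:
                if x in y and y in z and x not in z:
                    return False  # membership is not transitive on a
    return True


print([is_ordinal(s) for s in (EMPTY, ONE, TWO, THREE, POWER_OF_TWO)])
# prints [True, True, True, True, False]
```

The rejection of the last set comes from the connectedness test: the elements {{∅}} and {∅, {∅}} are distinct, yet neither is an element of the other.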


5.20 Theorem. The empty set is an element of every nonempty ordinal.


Proof. By definition, every ordinal A is transitive, so that if X ∈ A then X ⊆ A.
Consequently, if Y ∈ X and X ∈ A, then Y ∈ A. Therefore, if X ∈ A and X ≠ ∅,
then X is not the smallest element of A. Yet every nonempty ordinal A has a smallest
element. Hence contraposition shows that the smallest element must be ∅. □


5.21 Theorem. If A is an ordinal, then A ∉ A. Moreover, A ∉ X for each X ∈ A. In
particular, if A and X are ordinals, then A ∉ X or X ∉ A.


Proof. If A is an ordinal, then ∈ is connected (definitions 3.185, 3.194) and
irreflexive (definition 5.17): exactly one of X ∈ Y, X = Y, or Y ∈ X holds for
all X, Y ∈ A. Because A = A, it follows that A ∉ A. Moreover, if A is an ordinal
and X ∈ A, then X ⊆ A, whence if Y ∈ X then Y ∈ A. With Y := A, it follows by
contraposition that A ∉ X. □


5.22 Theorem. Every element of an ordinal is an ordinal.


Proof. If A is an ordinal and X ∈ A, then X ⊆ A because A is a transitive set. Hence
∈ well-orders X, because ∈ well-orders A. If moreover Z ∈ Y and Y ∈ X, then
Z ∈ X because A is a transitive set, whence Y ⊆ X. Thus X is also a transitive set.
Furthermore, the relation ∈ remains irreflexive on the subset X ⊆ A. □


5.23 Theorem. If A is an ordinal, then either A contains a last element D and
A = D ∪ {D}, or A = ⋃A is the union of all its elements.

Proof. If A is an ordinal and B ∈ A, then B ⊆ A; consequently, ⋃A ⊆ A.

Let D := ⋃A. If D ≠ A, then there exists X ∈ A \ D. Then X ⊆ D because
X ∈ A; hence X is an ordinal by theorem 5.22, whence X ≠ A by theorem 5.21, and
X ⊆ ⋃A = D.

Conversely, still with X ∈ A \ D, for each B ∈ A it follows that X ∉ B, whence
B ∈ X, and hence B ⊆ X, so that ⋃_{B∈A} B ⊆ X. Thus, D = ⋃A = ⋃_{B∈A} B ⊆ X.

Therefore, if D ≠ A, then D ⊆ X ⊆ D, whence X = D is the only element of
A \ D, and hence A = D ∪ {D}. □


5.24 Theorem. If B is an ordinal, then B ∪ {B} is also an ordinal.


Proof. First, B ∪ {B} is transitive. Indeed, if X ∈ B ∪ {B}, then either X ∈ B, whence
X ⊆ B ⊆ B ∪ {B}, or X ∈ {B}, whence X = B ⊆ B ∪ {B}.

Second, ∈ well-orders B ∪ {B}. Indeed, for each nonempty subset S ⊆ B ∪ {B},
two situations can occur: S ⊆ B or S ∩ {B} ≠ ∅. If S ⊆ B then S has a smallest
element, because B is an ordinal. If S ∩ {B} ≠ ∅, then either S = {B} has the
smallest element B, or S ∩ B ≠ ∅ and then S ∩ B is a subset of B and hence has a
smallest element, which is then also a smallest element of S = (S ∩ B) ∪ {B}.

Moreover, the relation ∈ remains irreflexive and transitive on B ∪ {B}. Indeed,
B ∉ B by theorem 5.21. Furthermore, if X ∈ Y and Y ∈ Z in B ∪ {B}, then Y ∈ B
from either Z ∈ B or Z ∈ {B}. Also, Y ≠ B by theorem 5.21, which forbids B ∈ Z
and Z ∈ B ∪ {B}, and hence X ≠ B also by theorem 5.21, which forbids B ∈ Y,
Y ∈ Z, and Z ∈ B ∪ {B}. Consequently only two cases can occur: Z ∈ B or Z = B.

If Z = B, then X ≠ B and Y ≠ B, whence X ∈ Z.

If Z ∈ B, then X, Y, and Z all three lie in B, whence X ∈ Z because ∈ is transitive
on the well-ordered set B.

Finally, ∈ is strict on B ∪ {B}. Indeed, because ∈ is strict on B, it follows that if
X ∈ B ∪ {B} and Y ∈ B ∪ {B}, then two different cases can arise.

If X ∈ B and Y ∈ B, then X ∈ Y and Y ∈ X cannot both hold, for transitivity
would yield X ∈ X which cannot hold by strictness of ∈ on B.

If X ∈ B and Y ∈ {B}, then X ∈ Y and Y ∈ X cannot both hold. Otherwise Y = B
and then X ∈ B and B ∈ X. However, X is also an ordinal by theorem 5.22, whence
∈ is also transitive on X, so that B ∈ X and X ∈ B yield B ∈ B, which cannot hold
by strictness of ∈ on X. □
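Repeating the step from B to B ∪ {B} of theorem 5.24, starting from the empty set, produces the finite ordinals one after another. The following Python sketch is an illustration on nested frozenset values; the names successor and finite_ordinal are hypothetical helpers.

```python
def successor(b):
    """Return B ∪ {B}."""
    return b | frozenset({b})


def finite_ordinal(n):
    """Apply the successor step n times, starting from the empty set."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a


three = finite_ordinal(3)
print(len(three))                  # prints 3
print(finite_ordinal(2) in three)  # prints True
```

In this representation the ordinal obtained after n steps has exactly n elements, namely the ordinals obtained after fewer steps, and its last element is the ordinal obtained after n − 1 steps.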


5.3.3 Well-Ordered Sets of Ordinals 


The following theorems show that every set of ordinals is well-ordered. First ⊆ is
strongly connected (definition 3.184) on every set of ordinals.


5.25 Theorem. For all ordinals A and B, either A ⊂ B, or A = B, or B ⊂ A.


Proof. If B ⊈ A, then there exists X ∈ B \ A, and hence there exists a smallest such
element: X ∈ B \ A, so that X ∈ Y for every Y ∈ B \ A with Y ≠ X. Also, X ≠ ∅,
because X ∉ A but ∅ ∈ A. From X ∈ B it follows that X ⊆ B, whence if Y ∈ X, then
Y ∈ B, but then Y ∈ A because of the minimality of X. Thus X ⊆ A ∩ B.

Conversely, if Y ∈ A ∩ B, then Y ∈ B, whence either Y ∈ X or X ∈ Y, because
X ∈ B also. However, X ∈ Y cannot occur, because X ∈ Y and Y ∈ A ∩ B would
yield X ∈ A. Thus, if Y ∈ A ∩ B, then Y ∈ X, which means that A ∩ B ⊆ X.

Consequently, X = A ∩ B.

The foregoing argument with A and B switched shows that if A ⊈ B, then Z :=
B ∩ A is the smallest element in A \ B. In particular, Z = B ∩ A = A ∩ B = X.

Consequently, if A ⊈ B and B ⊈ A both held, then X := A ∩ B =: Z would be
the smallest element in both A \ B and B \ A.

However, because (A \ B) ∩ (B \ A) = ∅, it follows that X ∉ (A \ B) ∩ (B \ A).
Thus at least one of B ⊈ A or A ⊈ B must fail to hold, which means that B ⊆ A or
A ⊆ B or both, so that either A ⊂ B, or A = B, or B ⊂ A. □


The second theorem shows that ∈ is strongly connected on every set of ordinals.

5.26 Theorem. For all ordinals A and B, either A = B, or A ∈ B, or B ∈ A.


Proof. Consider the sets C := A ∪ {A} and D := B ∪ {B}, which are ordinals by
theorem 5.24. Applying theorem 5.25 to C and D instead of A and B shows that
D ⊆ C or C ⊆ D. If C ⊆ D, then A ∈ C ⊆ D = B ∪ {B}, so that either A ∈ B or
A = B. If D ⊆ C, then B ∈ D ⊆ C = A ∪ {A}, so that either B ∈ A or B = A. □


5.27 Theorem. Every set ℱ of ordinals is well-ordered by ∈.


Proof. The relation ∈ is irreflexive on ℱ by theorem 5.21, and it is connected by
theorem 5.26. The relation ∈ is also transitive on ℱ. Indeed, for all A, B, C ∈ ℱ, if
A ∈ B and B ∈ C, then B ⊆ C whence A ∈ C.

Moreover, for each nonempty subset 𝒢 ⊆ ℱ, there exists some C ∈ 𝒢. For each
B ∈ 𝒢, either B ∈ C, or B = C, or C ∈ B, by theorem 5.26. Define


E := {B ∈ 𝒢 : B ∈ C} = C ∩ 𝒢.


If E = ∅, then C is the smallest element of 𝒢, because then C ∈ B for each B ∈ 𝒢
with B ≠ C. If E ≠ ∅, then E has a smallest element A ∈ E, because E is a
subset of the ordinal C. If B ∈ 𝒢, then B ∉ A, because B ∈ A would yield B ∈ C,
contradicting the minimality of A. Hence A ∈ B for every B ∈ 𝒢 with B ≠ A. □
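Theorems 5.25, 5.26, and 5.27 can be observed on the first few finite ordinals. The following Python sketch is an illustration on nested frozenset values; the names finite_ordinal and smallest are hypothetical helpers, and checking six ordinals does not replace the proofs. It confirms comparability by inclusion, the trichotomy for membership, and the existence of a smallest element in a chosen set of ordinals.

```python
def finite_ordinal(n):
    """Apply B -> B ∪ {B} n times, starting from the empty set."""
    a = frozenset()
    for _ in range(n):
        a = a | frozenset({a})
    return a


def smallest(family):
    """Return the member that equals or belongs to every member."""
    return next(c for c in family
                if all(c == b or c in b for b in family))


ordinals = [finite_ordinal(n) for n in range(6)]
for a in ordinals:
    for b in ordinals:
        assert a <= b or b <= a                         # theorem 5.25
        assert [a in b, a == b, b in a].count(True) == 1  # theorem 5.26

family = [ordinals[4], ordinals[2], ordinals[5]]
print(smallest(family) == ordinals[2])  # prints True
```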


5.3.4 Unions and Intersections of Sets of Ordinals 


The union and the intersection of every nonempty set of ordinals is an ordinal.

5.28 Theorem. For each set ℱ of ordinals, ⋃ℱ is also an ordinal.


Proof. The union ⋃ℱ is transitive, by theorem 5.16.

The union ⋃ℱ is well-ordered. Indeed, for each nonempty subset S ⊆ ⋃ℱ,
let


E := {A ∈ ℱ : A ∩ S ≠ ∅}.


From S ≠ ∅ follows E ≠ ∅. Hence E ⊆ ℱ has a smallest element A, because
ℱ is well-ordered by ∈, by theorem 5.27. Thus S ∩ A ≠ ∅. Therefore, S ∩ A ⊆ A also
has a smallest element B ∈ S ∩ A. Also, for each C ∈ S there exists D ∈ ℱ with
C ∈ D. Then either A = D, or A ∈ D, or D ∈ A. Yet D ∈ A cannot occur, by
minimality of A. From B ∈ A, with A ∈ D or A = D, follows B ∈ D; hence B ∈ D
and C ∈ D. Consequently, either B ∈ C or C ∈ B or B = C, but C ∈ B cannot occur
by minimality of B and because B ∈ A. Thus, B ∈ C or B = C, which shows that B
is the smallest element of S. By theorem 5.21, ∈ is strict on ⋃ℱ, for every element
is an ordinal by theorem 5.22. □


5.29 Theorem. For each nonempty set ℱ of ordinals, ⋂ℱ is also an ordinal.


Proof. The intersection ⋂ℱ is a transitive set by theorem 5.16. The intersection
⋂ℱ is also well-ordered. Indeed, each nonempty subset S ⊆ ⋂ℱ is also a subset
S ⊆ A of some A ∈ ℱ and hence S has a smallest element, because A is an ordinal.

The relation ∈ is a strict total order on ⋂ℱ, as it is on every subset of A ∈ ℱ,
where it is strict. Indeed, if X ∈ ⋂ℱ and Y ∈ ⋂ℱ, then X ∈ A and Y ∈ A, whence
X ∈ Y and Y ∈ X cannot both hold. □


5.3.5 Exercises on Ordinals 


5.11. Prove that ℕ is a transitive set.

5.12. Prove that every natural number N ∈ ℕ is a transitive set.

5.13. Prove that ℕ is an ordinal.

5.14. Prove that every natural number N ∈ ℕ is an ordinal.

5.15. Investigate whether ∈ is a transitive relation on every transitive set.
5.16. Prove that every ordinal is an element of some ordinal.

5.17. Prove that every ordinal is a subset of some ordinal.

5.18. Determine whether every ordinal is a subset of some transitive set.
5.19. Determine whether every transitive set is a subset of some ordinal.


5.20. Verify that it is not necessary to assume that V ≠ ∅ for Transfinite Induction.
In other words, prove that for each set W well-ordered by a relation ≼ and for each
subset V ⊆ W, if the following formula is True,


∀C{[(C ∈ W) ∧ (W_C ⊆ V)] ⇒ (C ∈ V)},


then either W = ∅ or V ≠ ∅.

5.21. Determine whether every singleton {A} with an ordinal A is an ordinal.
5.22. Determine whether {A, B} is an ordinal for all ordinals A and B.

5.23. Determine whether every set of ordinals is an ordinal.

5.24. Determine whether every subset of every ordinal is an ordinal.

5.25. Prove that there is an ordinal whose power set is not an ordinal.


5.26. Prove that every countable set admits a well-ordering.


5.4 Regularity of Well-Formed Sets 


5.4.1 Well-Formed Sets 


The following definition establishes sets that contain all the well-formed sets. 

5.30 Definition. For each ordinal A, define a set R(A) by transfinite construction:

• R(∅) := ∅;

• R(A ∪ {A}) := 𝒫[R(A)];

• R(A) := ⋃_{B∈A} R(B) if there does not exist any ordinal B such that A = B ∪ {B},
but if R(B) has been defined for every B ∈ A.


5.31 Definition. A set X is well-formed if and only if there exists an ordinal A such
that X ∈ R(A).


5.32 Remark. The transfinite construction proceeds as follows. For each ordinal W,
let


E := 𝒫[𝒫(W)],
Y := ⋃_{A∈W} E^(W_A),
P : Y → E,
P(R|_{W_A}) := ⋃_{B∈A} R(B).


The values of P remain in E, because if R|_{W_A} : W_A → E, then R(B) ∈ E =
𝒫[𝒫(W)] for each B ∈ W_A = A, whence R(B) ⊆ 𝒫(W), and hence ⋃_{B∈A} R(B) ⊆
𝒫(W), so that ⋃_{B∈A} R(B) ∈ 𝒫[𝒫(W)]. Yet for all ordinals U and V, U ⊆ V
or V ⊆ U by theorem 5.25, so that 𝒫[𝒫(U)] ⊆ 𝒫[𝒫(V)] or 𝒫[𝒫(V)] ⊆
𝒫[𝒫(U)]. Therefore, the transfinite construction defined with W := U or W := V
gives the same definition on U ∩ V. In other words, for each ordinal A, the definition
of R(A) can proceed from any ordinal W containing A, for example, W := A ∪ {A}.
5.33 Example. The first few sets of the form R(A) are also ordinals:


R(∅) = ∅,
R({∅}) = R(∅ ∪ {∅}) = 𝒫[R(∅)] = 𝒫[∅] = {∅},
R({∅, {∅}}) = R({∅} ∪ {{∅}}) = 𝒫[R({∅})] = 𝒫[{∅}] = {∅, {∅}}.


The next ordinal,

A = {∅, {∅}, {∅, {∅}}},

is not of the form R(B) for any set B, but A = B ∪ {B} for the ordinal B := {∅, {∅}}.
Hence, the list continues with


R({∅, {∅}, {∅, {∅}}})
= R({∅, {∅}} ∪ {{∅, {∅}}})
= 𝒫[R({∅, {∅}})]
= 𝒫[{∅, {∅}}]
= {∅, {∅}, {{∅}}, {∅, {∅}}},


which is of the form R(A), but it is not well-ordered by ∈ and hence not an ordinal,
by counterexample 5.19. The set R(A) is also different from the ordinal


{∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}.
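For finite ordinals, definition 5.30 amounts to iterating the power set, so the first few sets R(A) can be computed explicitly. The following Python sketch is an illustration on nested frozenset values; the function name stage, which takes the number n of successor steps instead of the ordinal itself, is a hypothetical convention introduced here.

```python
from itertools import combinations


def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))


def stage(n):
    """Return R(A) for the finite ordinal A reached after n successor
    steps: R(∅) = ∅ and R(A ∪ {A}) = 𝒫[R(A)]."""
    r = frozenset()
    for _ in range(n):
        r = power_set(r)
    return r


sizes = [len(stage(n)) for n in range(5)]
print(sizes)  # prints [0, 1, 2, 4, 16]
```

The sizes grow as iterated powers of two, so the next set in the list already has 2^16 = 65536 elements; only the limit clause of definition 5.30, which involves infinite ordinals, lies outside the reach of such a computation.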


5.34 Theorem. For each ordinal Q, the set R(Q) is transitive.


Proof. This proof proceeds by transfinite induction (theorem 5.3).

Choose an ordinal W with Q ∈ W, for instance, W := Q ∪ {Q}. Then consider
the subset V ⊆ W of all the elements C ∈ W for which R(C) is transitive.

Consider any element C ∈ W such that W_C ⊆ V. By transfinite induction, it
suffices to verify that C ∈ V to establish that V = W. Because


W_C = {B ∈ W : (B ∈ C) ∧ (B ≠ C)}


by definition 5.1 for the relation ∈ instead of ≼, it follows that W_C = C. From
C = W_C ⊆ V it then follows that R(B) is transitive for each B ∈ C by definition
of V. Consequently, ⋃_{B∈C} R(B) is also transitive, by theorem 5.16.

Two cases can arise, either C does not have a last element, or C has a last element.

If C does not have a last element, then R(C) = ⋃_{B∈C} R(B) is transitive.

If C has a last element Z ∈ C, then C = Z ∪ {Z}. From Z ∈ C = W_C ⊆ V, it
follows that R(Z) is transitive by definition of V. Hence 𝒫[R(Z)] is also transitive,
by theorem 5.14. Yet R(C) = R(Z ∪ {Z}) = 𝒫[R(Z)], whence R(C) is transitive.
Thus in either case R(C) is transitive, whence C ∈ V, and thence V = W.

Finally, from Q ∈ W = V it follows that R(Q) is transitive by definition of V. □
5.4.2 Regularity 


The following theorems confirm that every element, subset, pairing, power set, 
union, intersection, and Cartesian product of well-formed sets is again a well- 
formed set. 


5.35 Theorem. For each well-formed set X there is a smallest ordinal A with
X ∈ R(A).


Proof. If X is a well-formed set, then there exists an ordinal C such that X ∈ R(C) ∈
𝒫[R(C)] = R(C ∪ {C}). Hence X ∈ R(C ∪ {C}) by transitivity (theorem 5.34). By
theorem 5.27, there exists a smallest ordinal A ∈ C ∪ {C} such that X ∈ R(A).

For every ordinal D, either D = A, or D ∈ A ∈ C ∪ {C} and then X ∉ R(D), or
A ∈ D and then D is not the smallest such ordinal. □
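For sets built in finitely many steps from the empty set, the smallest ordinal of theorem 5.35 can be computed recursively. The following Python sketch is an illustration on nested frozenset values; the names stage and first_stage are hypothetical, and ordinals are represented by the number n of successor steps. It relies on two facts valid for these finite stages: X ∈ 𝒫[R(n)] exactly when every element of X lies in R(n), and the stages increase with n because each one is transitive.

```python
from itertools import combinations


def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))


def stage(n):
    """Return R for the finite ordinal reached after n successor steps."""
    r = frozenset()
    for _ in range(n):
        r = power_set(r)
    return r


def first_stage(x):
    """Return the smallest n such that x is an element of stage(n)."""
    return 1 + max((first_stage(y) for y in x), default=0)


# Compare the recursion with a direct search through the stages.
for x in stage(4):
    n = first_stage(x)
    assert x in stage(n) and x not in stage(n - 1)

print(first_stage(frozenset()))  # prints 1
```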


5.36 Theorem. If X is well-formed, then every Y ∈ X is well-formed.


Proof. If X is well-formed, then there exists an ordinal A such that X ∈ R(A).

If A = ∅, then X ∈ R(A) = R(∅) = ∅, whence Y ∈ X is vacuously well-formed.

If there exists an ordinal B such that A = B ∪ {B}, then X ∈ R(A) = 𝒫[R(B)],
whence X ⊆ 𝒫[R(B)] by transitivity of 𝒫[R(B)]. Consequently, Y ∈ R(A) =
𝒫[R(B)] is well-formed for each Y ∈ X. If R(A) = ⋃_{B∈A} R(B) and the theorem
holds for each Z ∈ R(B) for each B ∈ A, then for each X ∈ R(A) there exists B ∈ A
such that X ∈ R(B) whence every Y ∈ X is also well-formed. □


5.37 Theorem. If X and Y are well-formed sets, then so are {X, Y}, 𝒫(X), ⋃X,
X × Y, every subset of X, and ⋂X provided X ≠ ∅.


Proof. If X and Y are well-formed sets, then there exist ordinals A and B such that
X ∈ R(A) and Y ∈ R(B). Either A = B (whence R(A) = R(B)), or A ∈ B (whence
R(A) ⊆ R(B)), or B ∈ A (whence R(B) ⊆ R(A)). For instance, assume that R(A) ⊆
R(B). Thus X ∈ R(B) and Y ∈ R(B), so that {X, Y} ∈ 𝒫[R(B)] = R(B ∪ {B}),
whence {X, Y} is well-formed, because B ∪ {B} is an ordinal.

Similarly, if X ∈ R(B) is a well-formed set, then X ⊆ R(B) by transitivity, whence
𝒫(X) ⊆ 𝒫[R(B)] = R(B ∪ {B}) and hence 𝒫(X) ∈ 𝒫[R(B ∪ {B})] = R(A ∪ {A})
with A := B ∪ {B}. Thus, 𝒫(X) is well-formed.

Consequently, from 𝒫(X) ∈ R(A ∪ {A}) it follows that 𝒫(X) ⊆ 𝒫[R(A ∪ {A})],
whence if S ⊆ X, then S ∈ 𝒫[R(A ∪ {A})] is well-formed.

In particular, ⋂X is well-formed because ⋂X ⊆ ⋃X and ⋃X is well-formed.

Also, if X ∈ R(B) is a well-formed set, then X ⊆ R(B). Hence, if Z ∈ Y and
Y ∈ X, then Y ⊆ X whence Z ∈ X and Z ∈ R(B). This shows that ⋃X ⊆ R(B).
Consequently, ⋃X ∈ 𝒫[R(B)] = R(B ∪ {B}) is well-formed.

In particular, because {X, Y} is well-formed, it follows that X ∪ Y = ⋃{X, Y} is
well-formed, whence 𝒫(X ∪ Y), 𝒫[𝒫(X ∪ Y)], and 𝒫{𝒫[𝒫(X ∪ Y)]} are also
well-formed. Therefore, X × Y ∈ 𝒫{𝒫[𝒫(X ∪ Y)]} is also well-formed. □
5.38 Theorem. For all well-formed sets X and Y, ¬[(X ∈ Y) ∧ (Y ∈ X)]. In
particular, X ∉ X for each well-formed set X.


Proof. If X = ∅, then Y ∉ X and X ∉ X.

If X ≠ ∅ and Y ∈ X, then it suffices to verify that X ∉ Y. To this end, let A be
the first ordinal such that X ∈ R(A); in particular, X ⊆ R(A) by transitivity of R(A).

If A = Z ∪ {Z}, then X ∈ R(A) = R(Z ∪ {Z}) = 𝒫[R(Z)], so X ⊆ R(Z),
whence Y ∈ X ⊆ R(Z), and hence Y ∈ R(Z). However, X ∉ R(Z) because A is
the first ordinal with X ∈ R(A). If A does not contain a last element, then R(A) =
⋃_{B∈A} R(B), whence X ∈ R(A) means that there exists B ∈ A with X ∈ R(B), and
then X ⊆ R(B) by transitivity of R(B). Yet X ∉ R(B) because B ∈ A and A is the
first ordinal with X ∈ R(A).

Thus, in either case there exists C ∈ A such that X ⊆ R(C) and X ∉ R(C) and
Y ∈ R(C). (In the first case C := Z, while C := B in the second case.)

Therefore X ∉ Y, because X ∈ Y ∈ R(C) and the transitivity of R(C) would yield
X ∈ R(C). In particular, X ∉ X, because the foregoing argument applied to Y := X
shows that if X ∈ X, then X ∉ X, contrary to the axioms governing ∈, which state
that for all sets X and Y either X ∈ Y or X ∉ Y but not both. □


5.4.2.1 Independence of the axiom of regularity 


As an extension of theorem 5.38, every nonempty well-formed set X contains
an element Y that does not contain any element of X, so that Y ∩ X = ∅, or,
equivalently, so that there does not exist any Z ∈ Y such that Z ∈ X (exercise 5.31).
Outside of well-formed sets, however, there exist systems of set theory in which the
relation A ∈ A may hold for some set. One way of preventing the relation A ∈ A
from holding for any set consists in the axiom of regularity,


∀X{(X ≠ ∅) ⇒ (∃Y[(Y ∈ X) ∧ {∀Z[(Z ∈ Y) ⇒ (Z ∉ X)]}])},


attributed independently to Zermelo and von Neumann [128, p. 53]. The axiom of 
regularity has the disadvantage of asserting a condition about sets already defined by 
previous axioms. Yet within the theory of well-formed sets, exercise 5.31 confirms 
that the axiom of regularity is not independent but is a theorem derivable from 
the other axioms. Because well-formed sets suffice for most of logic, mathematics, 
computer science, and their applications, the foundations of these fields can restrict 
themselves to well-formed sets [74]. In contrast to the derivability of the axiom
of regularity from the other axioms of the theory of well-formed sets, neither the
generalized continuum hypothesis nor its negation is derivable from the other
axioms of the theory of well-formed sets [19, 20, 44, 45]. Thus, the “axiom of
regularity” is an example of an axiom that is “dependent” on the other axioms, 
whereas the generalized continuum hypothesis is an example of an axiom that is 
“independent” from the other axioms. 
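The condition stated by the axiom of regularity can be observed on the first finite stages R(A). The following Python sketch is an illustration on nested frozenset values; the names stage and regular_witness are hypothetical helpers, with ordinals represented by the number n of successor steps. It searches each nonempty element X of the sixteen-element stage for some Y ∈ X with Y ∩ X = ∅.

```python
from itertools import combinations


def power_set(a):
    """Return the set of all subsets of the finite set a."""
    items = list(a)
    return frozenset(frozenset(c) for r in range(len(items) + 1)
                     for c in combinations(items, r))


def stage(n):
    """Return R for the finite ordinal reached after n successor steps."""
    r = frozenset()
    for _ in range(n):
        r = power_set(r)
    return r


def regular_witness(x):
    """Return some y in x with y ∩ x empty, or None if there is none."""
    for y in x:
        if not (y & x):
            return y
    return None


nonempty_sets = [x for x in stage(4) if x]
print(all(regular_witness(x) is not None for x in nonempty_sets))
# prints True
```

The search succeeds for each of the fifteen nonempty sets examined; the general statement for all well-formed sets is the subject of exercise 5.31.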
5.4.3 Exercises on Well-Formed Sets


5.27. Prove that the set ℕ of all natural numbers is a well-formed set.

5.28. Prove that every natural number N ∈ ℕ is a well-formed set.

5.29. Prove that the set ℤ of all integers is a well-formed set.

5.30. Prove that the set ℚ of all rational numbers is a well-formed set.

5.31. Prove that for each nonempty well-formed set X there exists Y ∈ X such that
Y ∩ X = ∅.

5.32. Prove that if X is a well-formed set, then so is {X}.

5.33. For each well-formed set X, prove that if A is the smallest ordinal such that
X ∈ R(A), then there exists an ordinal B such that A = B ∪ {B}.

5.34. Prove that every finite set of well-formed sets is a well-formed set.

5.35. Determine whether every countable set of well-formed sets is a well-formed
set.

5.36. Determine whether every set of well-formed sets is a well-formed set.
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Chapter 6 
The Axiom of Choice: Proofs 
by Transfinite Induction 


6.1 Introduction 


This chapter presents several statements, which are called “principles” because
they are well-formed formulae but not propositions, in the sense that neither
they nor their negations are theorems in the Zermelo-Fraenkel set theory. The
first sections show how Zorn’s Maximal-Element Principle implies Zermelo’s
Well-Ordering Principle, which in turn implies the Choice Principle. Thus any extension
of the Zermelo-Fraenkel set theory that includes Zorn’s Maximal-Element Principle
as an axiom also includes the other two principles as theorems. From the Choice
Principle, subsequent sections demonstrate the converse implications, known as
Zorn’s Lemma and Zermelo’s Theorem, so that all three principles are logically
equivalent within the Zermelo-Fraenkel set theory. Hence all three principles are
theorems in the Zermelo-Fraenkel-Choice set theory, which includes the Choice
Principle as the Axiom of Choice. The material also introduces yet other principles
that are logically equivalent to the Axiom of Choice, for example, the principle of
the distributivity of intersections over unions of families of sets. Any theory that
requires any such equivalent principle thus also requires the Axiom of Choice.
Other consequences of the Axiom of Choice include the existence of extrema for
continuous functions on closed and bounded sets in Euclidean spaces.


6.2 The Choice Principle 


This section presents several mutually equivalent forms of the Choice Principle.


6.2.1 The Choice-Function Principle


One version of the Choice Principle relies on the concept of a “choice function”: 


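6.1 Definition (choice function). For each set $\mathcal{F}$ of nonempty sets, a choice
function is a function $C \colon \mathcal{F} \to \bigcup \mathcal{F}$ such that $C(A) \in A$ for every set $A \in \mathcal{F}$.


6.2 Example. If $\mathcal{F} = \varnothing$, then $\bigcup \mathcal{F} = \bigcup \varnothing = \varnothing$: there exists exactly one function
$C \colon \mathcal{F} \to \bigcup \mathcal{F}$, namely $\varnothing \colon \varnothing \to \varnothing$, which is “vacuously” a choice function.


6.3 Example. With $1 = \{\varnothing\}$, if $\mathcal{F} = \{1\}$, then $\bigcup \mathcal{F} = \bigcup \{1\} = 1 = \{\varnothing\}$: there
exists exactly one function $C \colon \mathcal{F} \to \bigcup \mathcal{F}$, namely $C \colon \{1\} \to \{\varnothing\}$ with $C(1) = \varnothing$,
which is a choice function because $C(1) \in 1$.


6.4 Example. For each nonempty set $A$, if $\mathcal{F} = \{A\}$, then there exists $X \in A$,
whence $(A, X) \in \mathcal{F} \times \bigcup \mathcal{F} = \{A\} \times A$, and $C := \{(A, X)\}$ is a choice function.


6.5 Example. With $1 = \{\varnothing\}$ and $2 = \{\varnothing, 1\}$, if $\mathcal{F} = \{1, 2\}$, then $\bigcup \mathcal{F} =
\bigcup \{1, 2\} = 1 \cup 2 = \{\varnothing, 1\} = 2$. There are two choice functions $C \colon \{1, 2\} \to \{\varnothing, 1\}$:

The requirement that $C(1) \in 1 = \{\varnothing\}$ imposes that $C(1) = \varnothing$.

The requirement that $C(2) \in 2 = \{\varnothing, 1\}$ allows for the two possibilities:

either $C(2) = \varnothing \in \{\varnothing, 1\}$, or $C(2) = 1 \in \{\varnothing, 1\}$.

Thus the two choice functions are $F \colon \mathcal{F} \to \bigcup \mathcal{F}$ with $F(1) = \varnothing$ and $F(2) = \varnothing$,
or $G \colon \mathcal{F} \to \bigcup \mathcal{F}$ with $G(1) = \varnothing$ and $G(2) = 1$.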
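

For finite families, the count in example 6.5 can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the helper name `choice_functions` and the encoding of sets as `frozenset` objects are choices made for this example and do not come from the text.

```python
from itertools import product

def choice_functions(family):
    """Yield each choice function on a finite family of finite sets as a dict.

    Each dict maps a member A of the family to one chosen element of A.
    If some member is empty, nothing is yielded.
    """
    members = list(family)
    # product(*members) runs through every way of picking one element per member.
    for picks in product(*members):
        yield dict(zip(members, picks))

# Von Neumann numerals 0 = {}, 1 = {0}, 2 = {0, 1}, encoded as frozensets
# (an encoding assumed for this sketch).
ZERO = frozenset()
ONE = frozenset({ZERO})
TWO = frozenset({ZERO, ONE})

functions = list(choice_functions({ONE, TWO}))
```

Here `functions` holds the two choice functions of example 6.5: both send 1 to the empty set, and they differ only in the element chosen from 2. Applied to the empty family, the same helper yields exactly one (empty) function, in agreement with example 6.2.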


More generally, each finite set of nonempty sets has a choice function. 


6.6 Theorem. Within the Zermelo-Fraenkel set theory, each finite set of nonempty
sets has a choice function.


Proof. This proof proceeds by induction on the number $N \in \mathbb{N}$ of sets.

For $N = 0$, example 6.2 shows that the empty set has a choice function.

Example 6.4 proves the theorem for $N = 1$.

As an induction hypothesis, assume that the theorem holds for some $N = M \in \mathbb{N}$
and every finite set with exactly $M$ elements, all nonempty. For each set $\mathcal{G}$ with
exactly $M + 1$ elements, all nonempty, there exists $B \in \mathcal{G}$, so that $\mathcal{F} := \mathcal{G} \setminus \{B\}$ has
exactly $M$ elements, all nonempty. By the induction hypothesis, there exists a choice
function $F \colon \mathcal{F} \to \bigcup \mathcal{F}$. Example 6.4 shows that there exists a choice function
$G \colon \{B\} \to \bigcup \{B\} = B$. Because $\mathcal{F} \cap \{B\} = \varnothing$, the union
$C := F \cup G \colon \mathcal{F} \cup \{B\} = \mathcal{G} \to (\bigcup \mathcal{F}) \cup B = \bigcup \mathcal{G}$ is a function.
Also, $C(A) = F(A) \in A$ for each $A \in \mathcal{F}$
and $C(B) = G(B) \in B$ for each $B \in \{B\}$, whence $C$ is a choice function for $\mathcal{G}$. □
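

The induction in this proof translates directly into a recursive construction. The sketch below is a hedged illustration for finite inputs only; the function name `choice_function_by_induction` and the representation of a function as a Python `dict` are assumptions of this example.

```python
def choice_function_by_induction(family):
    """Build one choice function on a finite set of nonempty sets, recursively.

    The recursion mirrors the proof: remove one member B, obtain a choice
    function F on the remaining members, and extend F by a choice G on {B}.
    """
    members = set(family)
    if not members:
        # Base case N = 0: the empty function is vacuously a choice function.
        return {}
    b = next(iter(members))                       # some B in the family
    f = choice_function_by_induction(members - {b})  # induction hypothesis
    x = next(iter(b))                             # some X in B (fails if B is empty)
    return {**f, b: x}                            # the union of F and G = {(B, X)}

family = {frozenset({1, 2}), frozenset({2, 3}), frozenset({4})}
c = choice_function_by_induction(family)
```

The recursion depth equals the number of members, so this sketch suits only small families. Nothing in it extends to infinite families, which is precisely where the Choice Principle becomes necessary.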


Still, theorem 6.6 applies only to finite sets. 


6.7 Example. No choice functions are known for $\mathcal{G} = \mathcal{P}[\mathcal{P}(\mathbb{N})] \setminus \{\varnothing\}$, which
is the set of all nonempty sets of subsets of the set $\mathbb{N}$ of natural numbers, which
corresponds to the set of all nonempty sets of real numbers between 0 and 1.


The statement of the existence of choice functions is one version of the Choice 
Principle, called the Choice-Function Principle to distinguish it from other versions.


6.8 Definition (Choice-Function Principle). Each set of nonempty sets has a
choice function. Specifically, the Choice-Function Principle is formula (6.1):


\[
\forall \mathcal{F} \Bigl( \bigl\{ \forall A \, [ (A \in \mathcal{F}) \Rightarrow (A \neq \varnothing) ] \bigr\}
\Rightarrow \bigl[ \exists C \, \bigl\{ (C \colon \mathcal{F} \to \bigcup \mathcal{F}) \wedge [ \forall A \, \{ (A \in \mathcal{F}) \Rightarrow [ C(A) \in A ] \} ] \bigr\} \bigr] \Bigr)
\tag{6.1}
\]


Neither the Choice-Function Principle 6.8 nor its negation is a proposition in the
Zermelo-Fraenkel set theory: there it is merely formula (6.1).

The concept of a “family” of sets provides a convenient way to specify more than 
one operation on a set, for instance, to choose more than one element from a set. 


6.9 Definition (family of sets). A family of sets $\mathcal{F} = \{A_i : i \in \mathcal{I}\}$ is a set (of sets)
with a function $I \colon \mathcal{I} \to \mathcal{F}$, $i \mapsto A_i$, from a set $\mathcal{I}$, called the indexing set or set of
indices, to $\mathcal{F}$.


6.10 Example (Self-Indexed Family of Sets). Each set (of sets) $\mathcal{F}$ is a family of
sets: the same set $\mathcal{I} := \mathcal{F}$ serves as an indexing set, and the identity function
$I \colon \mathcal{F} \to \mathcal{F}$, $E \mapsto A_E := E$ shows that $\mathcal{F} = \{E : E \in \mathcal{F}\} = \{A_E : E \in \mathcal{F}\}$.


Still other versions of the Choice Principle rely on families of sets, as in
definitions 6.11 and 6.12.


6.11 Definition (family choice-function). For each family $\{A_i : i \in \mathcal{I}\}$ of sets, a
family choice-function is a function $C \colon \mathcal{I} \to \bigcup_{i \in \mathcal{I}} A_i$ such that $C(i) \in A_i$ for
each index $i \in \mathcal{I}$.


6.12 Definition (Family Choice-Function Principle). Each family of nonempty 
sets has a family choice-function. 


The Family Choice-Function Principle 6.12 and its negation are formulae but not
propositions in the Zermelo-Fraenkel set theory. However, theorem 6.13 shows that
choice functions are equivalent to family choice-functions.


6.13 Theorem. Within the Zermelo-Fraenkel set theory, the Choice-Function
Principle 6.8 is logically equivalent to the Family Choice-Function Principle 6.12.


Proof. If the Family Choice-Function Principle 6.12 holds and $\mathcal{F}$ is a set of
nonempty sets, then the self-indexed family $\mathcal{F} = \{E : E \in \mathcal{F}\}$ from example 6.10
has a family choice-function $C \colon \mathcal{F} \to \bigcup_{E \in \mathcal{F}} E = \bigcup \mathcal{F}$ such that $C(E) \in E$ for
each $E \in \mathcal{F}$. Thus $C$ is a choice function for $\mathcal{F}$.

Conversely, if the Choice-Function Principle 6.8 holds and $\mathcal{F} := \{A_i : i \in \mathcal{I}\}$ is
a nonempty family of nonempty sets, indexed by a function $I \colon \mathcal{I} \to \mathcal{F}$, then there
exists a choice function $C \colon \mathcal{F} \to \bigcup \mathcal{F}$ such that $C(E) \in E$ for each $E \in \mathcal{F}$. The
composite function $C \circ I \colon \mathcal{I} \to \bigcup_{i \in \mathcal{I}} A_i$, $i \mapsto C[I(i)] = C(A_i) \in A_i$, is a family
choice-function. □


Another version of the Choice Principle relies on choice functions only for
pairwise disjoint sets, as specified by definition 6.14.


6.14 Definition (Choice-Function Principle for Pairwise Disjoint Sets). Each
set of pairwise disjoint nonempty sets has a choice function.


Yet another version of the Choice Principle relies on functions and relations, as
specified by definition 6.15.


6.15 Definition (Choice-Relation Principle). For each relation $R$ there exists a
function $F \subseteq R$ such that $F$ and $R$ have the same domain [128, p. 243, AC3].


In the Zermelo-Fraenkel set theory, a relation is a subset of a Cartesian product:
$R \subseteq A \times B$. There is a different principle, called the Relational Axiom of Choice,
where $R$ may be a relation defined for all sets [88, p. 22].


6.2.2 The Choice-Set Principle 


This subsection provides different statements of the Choice Principle that are
logically mutually equivalent within the Zermelo-Fraenkel set theory. One version
of the Choice Principle relies on “choice sets” rather than choice functions.


6.16 Definition (choice set). For each set $\mathcal{F}$ of nonempty sets, a choice set is a set
$S \subseteq \bigcup \mathcal{F}$ such that for each $A \in \mathcal{F}$, $S \cap A$ contains exactly one element.


As with the Choice-Function Principle 6.8, neither the Choice-Set Principle nor
its negation is a proposition in the Zermelo-Fraenkel set theory: it is merely a
formula, stated in words in definition 6.17.


6.17 Definition (Choice-Set Principle). Each set of nonempty sets has a
choice set.


6.18 Example. If $\mathcal{F} = \varnothing$, then $\bigcup \mathcal{F} = \bigcup \varnothing = \varnothing$: there exists exactly one subset
$S \subseteq \bigcup \mathcal{F} = \varnothing$, namely $S = \varnothing$, which is “vacuously” a choice set, because $S \cap A$
“vacuously” contains exactly one element for each $A \in \mathcal{F} = \varnothing$.


6.19 Example. With $1 = \{\varnothing\}$, if $\mathcal{F} = \{1\}$, then $\bigcup \mathcal{F} = \bigcup \{1\} = 1 = \{\varnothing\}$,
whence there exists exactly one choice set $S \subseteq \bigcup \mathcal{F} = 1$, namely $S = 1$. Indeed,
the only element in $\mathcal{F} = \{1\}$ is $1$, and $S \cap 1 = 1 \cap 1 = \{\varnothing\}$, which contains exactly
one element, namely $\varnothing \in 1$. In contrast, $\varnothing \subseteq \bigcup \mathcal{F} = 1$ is not a choice set.


Theorem 6.20 shows that choice sets are equivalent to choice functions.


6.20 Theorem. Within the Zermelo-Fraenkel set theory, the Choice-Function
Principle 6.8 is logically equivalent to the Choice-Set Principle 6.17.


Proof. Each choice set $S \subseteq \bigcup \mathcal{F}$ corresponds to the choice function $F_S$ defined by

\[
F_S := \bigl\{ (A, X) \in \mathcal{F} \times \bigcup \mathcal{F} : X \in A \cap S \bigr\}.
\]

Conversely, each choice function is a subset of a Cartesian product: $F \subseteq \mathcal{F} \times \bigcup \mathcal{F}$.
Thus $F$ consists of pairs $(A, X)$ with exactly one $X \in A$ for each $A \in \mathcal{F}$. Hence

\[
S_F := \bigl\{ X \in \bigcup \mathcal{F} : \exists A \, \{ (A \in \mathcal{F}) \wedge [ (A, X) \in F ] \} \bigr\}
\]

is a choice set, obtained from the second projection (example 3.87) of $F$. □


Still another version of the Choice Principle relies on choice sets only for
pairwise disjoint sets, as specified by definition 6.21.


6.21 Definition (“Pairwise Disjoint” Choice-Set Principle). Each set of pairwise
disjoint nonempty sets has a choice set.


Theorem 6.22 shows that choice sets for any sets, and choice sets for pairwise
disjoint sets, are mutually equivalent in the Zermelo-Fraenkel set theory.


6.22 Theorem. Within the Zermelo-Fraenkel set theory, the Choice-Set
Principle 6.17 is logically equivalent to the “Pairwise Disjoint” Choice-Set
Principle 6.21.


Proof. The “Pairwise Disjoint” Choice-Set Principle 6.21 is a particular case of the
Choice-Set Principle 6.17. Thus the latter implies the former.

To establish the converse implication, to each set $\mathcal{E}$ of nonempty sets corresponds
the set $\mathcal{F}$ of pairwise disjoint nonempty sets defined by

\[
A' := \{ (A, X) : X \in A \} \subseteq \mathcal{E} \times \bigl( \bigcup \mathcal{E} \bigr),
\qquad
\mathcal{F} := \{ A' : A \in \mathcal{E} \} \subseteq \mathcal{P}\bigl[ \mathcal{E} \times \bigl( \bigcup \mathcal{E} \bigr) \bigr].
\]

The elements of $\mathcal{F}$ are pairwise disjoint, because if $A, B \in \mathcal{E}$ and $A \neq B$, then
$(A, X) \neq (B, Y)$ regardless of $X \in A$ and $Y \in B$, whence $A' \cap B' = \varnothing$. Also,
$A' \neq \varnothing$ for each $A' \in \mathcal{F}$, because $A$ contains at least one element $X$ by hypothesis,
so that $A'$ contains at least one element $(A, X)$. If the “Pairwise Disjoint” Choice-Set
Principle 6.21 holds, then there exists a choice set $S \subseteq \bigcup \mathcal{F} \subseteq \mathcal{E} \times (\bigcup \mathcal{E})$ such
that $S \cap A'$ is a singleton for each $A' \in \mathcal{F}$: there exists a unique $X \in A$ for which
$(A, X) \in A' \cap S$. The second projection of $S$ yields a choice set $T \subseteq \bigcup \mathcal{E}$ such that
$T \cap A = \{X\}$ is a singleton for each $A \in \mathcal{E}$. □
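

The tagging step $A \mapsto A'$ of this proof can be tried on a small finite example. The Python sketch below is an illustration only: the helper name `tag`, the sample sets `A` and `B`, and the rule used to pick one pair from each tagged set are all assumptions of this example.

```python
def tag(family):
    """Map each member A to A' = {(A, X) : X in A}.

    Distinct members receive distinct first components, so the tagged sets
    are pairwise disjoint even when the original members overlap.
    """
    return {a: frozenset((a, x) for x in a) for a in family}

A = frozenset({1, 2})
B = frozenset({2, 3})          # A and B overlap in the element 2
tagged = tag({A, B})

# One possible choice set S for the tagged family: exactly one pair per A'.
# Taking the pair with the smallest second component is an arbitrary rule.
S = {min(pairs, key=lambda pair: pair[1]) for pairs in tagged.values()}

# Read as a set of pairs (A, X), S is itself a function A -> X with X in A.
choice = dict(S)
```

In this run `choice` sends `A` to 1 and `B` to 2, so the pairs selected from the disjoint tagged sets already encode a choice function on the original, overlapping family.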


The statements of the Choice-Set Principle 6.17 and the “Pairwise Disjoint”
Choice-Set Principle 6.21 with a proof of their mutual equivalence in the
Zermelo-Fraenkel set theory similar to theorem 6.22 are in Zermelo’s [145].


6.23 Definition (Axiom of Choice). The Axiom of Choice is any of the principles
6.8, 6.14, 6.15, 6.17, 6.21, included as an axiom in a set theory.


6.2.3 Exercises on Choice Principles 


6.1. Find the number of choice functions for each finite set $\mathcal{F} = \{A_0, \ldots, A_{N-1}\}$
of nonempty finite sets where each element $A_i$ has exactly $N_i$ elements.


6.2. Find the number of choice sets for each finite set $\mathcal{F} = \{A_0, \ldots, A_{N-1}\}$ of
nonempty finite sets where each element $A_i$ has exactly $N_i$ elements.


6.3. Prove that each finite set of nonempty finite sets has a choice set.


6.4. Prove that the Choice-Function Principle 6.8 is logically equivalent to the
“Pairwise Disjoint” Choice-Function Principle 6.14 within the Zermelo-Fraenkel
set theory.


6.5. Translate the Choice-Set Principle 6.17 into a logical formula, similar to
formula (6.1) for the Choice-Function Principle 6.8.


6.6. Translate the “Pairwise Disjoint” Choice-Set Principle 6.21 into a logical
formula, similar to formula (6.1) for the Choice-Function Principle 6.8.


6.7. Prove that the Choice-Relation Principle 6.15 is logically equivalent to the
Choice-Function Principle 6.8 in the Zermelo-Fraenkel set theory.


6.8. Translate the “Pairwise Disjoint” Choice-Function Principle 6.14 into a
logical formula, similar to formula (6.1) for the Choice-Function Principle 6.8.


6.3 Maximality and Well-Ordering Principles


This section introduces two principles and shows that each of them implies the 
Choice Principle. 


6.3.1 Zermelo’s Well-Ordering Principle 


As with the Choice-Function Principle 6.8, neither Zermelo’s Well-Ordering Principle
nor its negation is a proposition in the Zermelo-Fraenkel set theory: it is merely a
formula, stated in words in definition 6.24.


6.24 Definition (Zermelo’s Well-Ordering Principle). Each set has a well-order
[88, p. 117].


6.25 Example. The order $\le$ is a well-order on the set $\mathbb{N}$ of all natural numbers.


6.26 Example. Every finite set has a well-order. Indeed, by definition of “finite” for
each finite set $E$ there exist a natural number $N \in \mathbb{N}$ and a bijection $F \colon E \to N$.
The subset $N \subseteq \mathbb{N}$ is well-ordered by example 6.25 and theorem 3.197. Hence the
relation $\preceq$ defined on $E$ by $X \preceq Y$ if and only if $F(X) \le F(Y)$ well-orders $E$.


Theorem 6.27 shows that the existence of a choice function on a set follows from
the existence of a well-ordering on its union.


6.27 Theorem. For each set $\mathcal{F}$ of nonempty sets with a well-ordered union $\bigcup \mathcal{F}$,
there exists a function $C \colon \mathcal{F} \to \bigcup \mathcal{F}$ such that $C(A) \in A$ for every set $A \in \mathcal{F}$.


Proof. Let $C(A)$ be the first element of $A$ relative to the well-order on $\bigcup \mathcal{F}$. □


6.28 Example. If $\mathcal{F} = \mathcal{P}(\mathbb{N}) \setminus \{\varnothing\}$, which is the set of all nonempty subsets of the
set $\mathbb{N}$ of natural numbers, then $\bigcup \mathcal{F} = \mathbb{N}$, where $\le$ is a well-order. By the proof of
theorem 6.27, the function $C \colon \mathcal{F} \to \mathbb{N}$ with $C(A) = \min(A)$ chooses the smallest
element of $A$ and thus is a choice function for $\mathcal{F}$.
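

The construction in the proof of theorem 6.27 amounts to taking a minimum. The following Python sketch illustrates it for finite collections of finite sets only; the helper name `choice_from_well_order` and the parameter `rank` (an injective map into the natural numbers that stands in for the well-order) are assumptions of this example.

```python
def choice_from_well_order(family, rank):
    """Choose from each member A its first element under the order induced by rank.

    rank is assumed injective on the union of the family, so that comparing
    ranks well-orders that union.
    """
    return {a: min(a, key=rank) for a in family}

family = {frozenset({3, 5, 8}), frozenset({5}), frozenset({0, 8})}

# The usual order on natural numbers: C(A) = min(A), as in example 6.28.
by_usual_order = choice_from_well_order(family, rank=lambda n: n)

# A different well-order on the same finite union gives a different choice function.
by_reverse_order = choice_from_well_order(family, rank=lambda n: 8 - n)
```

The two results differ, which shows that the choice function depends on the well-order selected, even though its existence does not.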


6.29 Example. No well-orders are known for $\mathcal{E} = \mathcal{P}(\mathbb{N})$, which is the set of all
subsets of the set $\mathbb{N}$ of natural numbers, which is also isomorphic to the set of all real
numbers from 0 through 1. Because $\mathcal{E} = \bigcup \mathcal{G}$ with $\mathcal{G} = \mathcal{P}[\mathcal{P}(\mathbb{N})] \setminus \{\varnothing\}$ from
example 6.7, any specific well-order on $\mathcal{E}$ would yield a specific choice function on
$\mathcal{G}$, by theorem 6.27.


Theorem 6.30 shows a logical relation between the foregoing principles.


6.30 Theorem. Zermelo’s Well-Ordering Principle 6.24 implies the
Choice-Function Principle 6.8.


Proof. Theorem 6.27 shows that if every set admits a well-ordering relation, in
particular, $\bigcup \mathcal{F}$, then every set $\mathcal{F}$ of nonempty sets has a choice function. □


As sets (of pairs), relations are partially ordered by inclusion. Theorem 6.31
shows that every total order is maximal relative to inclusion among partial orders.


6.31 Theorem. Each reflexive total order on a set is maximal among all partial
orders on that set.


Proof. If $R$ is a reflexive total order on a set $A$, and if $T$ is a relation on $A$ such
that $R \subsetneq T$, then there are $X, Y \in A$ such that $(X, Y) \in T$ but $(X, Y) \notin R$, whence
$(Y, X) \in R$ by totality of $R$. Hence $X \neq Y$ by reflexivity of $R$. Consequently,
$(Y, X) \in R \subseteq T$ and $(X, Y) \in T$ with $X \neq Y$. Thus $T$ is not anti-symmetric and therefore not
a partial order. Therefore, $R$ is not properly contained in any partial order on $A$. □


The concept of maximality in a partially ordered set leads to Zorn’s
Maximal-Element Principle.


6.3.2 Zorn’s Maximal-Element Principle


As with the Choice-Function Principle 6.8 and Zermelo’s Well-Ordering Principle 6.24,
neither Zorn’s Maximal-Element Principle nor its negation is a proposition in
the Zermelo-Fraenkel set theory: it is merely a formula, stated in words in
definition 6.32.


6.32 Definition (Zorn’s Maximal-Element Principle). In each partially ordered 
set where each chain has an upper bound, there is a maximal element [88, p. 117], 
[128, p. 248, Zo]. 


Theorem 6.33 shows another logical relation between the foregoing principles. 


6.33 Theorem. Zorn’s Maximal-Element Principle 6.32 implies Zermelo’s Well- 
Ordering Principle 6.24. 


Proof. This proof follows Dugundji’s [30, p. 34, (2) $\Rightarrow$ (3)]. For each set $X$, let
$\mathcal{W}$ be the set of all pairs $(A, R)$ where $R$ weakly well-orders a subset $A \subseteq X$. Thus
$\mathcal{W} \subseteq [\mathcal{P}(X)] \times [\mathcal{P}(X \times X)]$ is well-formed. Also, $\mathcal{W} \neq \varnothing$, because $(\varnothing, \varnothing) \in \mathcal{W}$.

Define a relation $\preceq$ on $\mathcal{W}$ by $(A, R) \preceq (B, S)$ if and only if $A \subseteq B$ and $R \subseteq S$
with $(X, Y) \in S$ for all $X \in A$ and $Y \in B \setminus A$. Then $\preceq$ partially orders $\mathcal{W}$:


Reflexivity. $(A, R) \preceq (A, R)$ because $A \subseteq A$ and $R \subseteq R$ with $A \setminus A = \varnothing$.
Antisymmetry. If $(A, R) \preceq (B, S)$ with $(B, S) \preceq (A, R)$, then $A \subseteq B$ and $R \subseteq S$
with $B \subseteq A$ and $S \subseteq R$, whence $A = B$ and $R = S$, so that $(A, R) = (B, S)$.
Transitivity. If $(A, R) \preceq (B, S)$ and $(B, S) \preceq (C, T)$, then $A \subseteq B \subseteq C$ and
$R \subseteq S \subseteq T$. If also $X \in A$ and $Z \in C \setminus A$, then two cases can arise: either $Z \in B \setminus A$,
or $Z \in C \setminus B$.
In the first case, if $Z \in B \setminus A$, then $(X, Z) \in S \subseteq T$ because $X \in A$.
In the second case, if $Z \in C \setminus B$, then $(X, Z) \in T$ because $X \in B$.
Thus $(A, R) \preceq (C, T)$.


For each chain $\mathcal{Y} \subseteq \mathcal{W}$ linearly ordered by $\preceq$ in $\mathcal{W}$ define

\[
C := \bigcup_{(A, R) \in \mathcal{Y}} A,
\qquad
T := \bigcup_{(A, R) \in \mathcal{Y}} R.
\]

Then $T$ weakly well-orders $C$:


Reflexivity. For each $X \in C$ there exists $(A, R) \in \mathcal{Y}$ such that $X \in A$, whence
$(X, X) \in R \subseteq T$ by reflexivity of $R$.

Antisymmetry. For all $X, Y \in C$ there exist $(A, R), (B, S) \in \mathcal{Y}$ such that $X \in A$
and $Y \in B$. Because $\mathcal{Y}$ is a chain, two cases can arise: either $(A, R) \preceq (B, S)$,
or $(B, S) \preceq (A, R)$. In the first case, if $(A, R) \preceq (B, S)$, then $X, Y \in B$; if also
$(X, Y) \in T$ and $(Y, X) \in T$, then $(X, Y) \in S$ and $(Y, X) \in S$ by maximality of $S$
on $B$ from theorem 6.31, whence $X = Y$ by anti-symmetry of $S$. The second case
is similar.

Transitivity. For all $X, Y, Z \in C$ there exist $(A, R), (B, S), (D, U) \in \mathcal{Y}$ such that
$X \in A$, $Y \in B$, $Z \in D$. Because $\mathcal{Y}$ is a chain, assume $A \subseteq B \subseteq D$ and $R \subseteq S \subseteq U$.
(The other five orders are similar.) If also $(X, Y), (Y, Z) \in T$, then there exist
$(E, V), (F, W) \in \mathcal{Y}$ such that $(X, Y) \in V$ and $(Y, Z) \in W$. Again because $\mathcal{Y}$ is
a chain, assume $V \subseteq W$, whence $(X, Y), (Y, Z) \in W$, and hence $(X, Y), (Y, Z) \in U$
by maximality of $U$ on $D$ from theorem 6.31, whence $(X, Z) \in U \subseteq T$ by
transitivity of $U$.

Well-ordering. Each nonempty subset $E \subseteq C$ contains at least one element
$Z \in E \subseteq C = \bigcup_{(A, R) \in \mathcal{Y}} A$. Hence there exists $(A, R) \in \mathcal{Y}$ such that $Z \in A$.

Thus $A \cap E \neq \varnothing$, whence $A \cap E$ has a least element $X \in A \cap E$, because $R$
well-orders $A$.

Hence $X$ is also the least element of $E$. Indeed, if $Y \in E$, then there exists $(B, S) \in \mathcal{Y}$
such that $Y \in B$. Either $Y \in A$, or $Y \in B \setminus A$.

In the first case, if $Y \in A$, then $(X, Y) \in R \subseteq T$, because $X$ is the least element of
$A \cap E$.

In the second case, if $Y \in B \setminus A$, then $(X, Y) \in S$ by definition of $\preceq$ and hence
$(X, Y) \in S \subseteq T$.

In either case $(X, Y) \in T$. Therefore $X$ is the least element of $E$.


Thus $(C, T) \in \mathcal{W}$, and $(C, T)$ is an upper bound relative to $\preceq$ for $\mathcal{Y}$.

Hence, if Zorn’s Maximal-Element Principle holds, then $\mathcal{W}$ contains a maximal
element $(D, U)$. In particular, $D = X$: by contraposition, if $D \neq X$, then there would
exist $K \in X \setminus D$. Setting $E := D \cup \{K\}$ and $V := U \cup \{(H, K) : H \in E\}$ would
define a strict upper bound such that $(D, U) \prec (E, V)$, contradicting the maximality
of $(D, U)$. Therefore, $U$ well-orders $D = X$. □


Definition 6.34 states another version of Zorn’s Maximal Principle. 


6.34 Definition (Zorn’s Maximal-Set Principle). In each nonempty set (of sets) 
that contains the union of each chain of elements (which are sets) relative to 
inclusion, there is a maximal element relative to inclusion: an element that is not 
a proper subset of any other element. [128, p. 245, Z)]. 


Max Zorn stated the Maximal-Set Principle 6.34 for sets of sets partially ordered 
by inclusion in reference [147, p. 667, (MP)], where the axiom of the power-set is 
also in question [147, p. 669, (MP)], but did not prove it there from any other axioms. 
Indeed, Max Zorn also stated that he would show its relation and equivalence with 
the Axiom of Choice in “another paper” [147, p. 669, (MP)]. In other words, “Zorn’s 
Lemma” is not in [147]. 

The Axiom of Choice is logically independent from the Zermelo-Fraenkel axioms
of set theory [21, 66, Ch. 5]. Consequently, appending Zorn’s Maximal-Element
Principle to the Zermelo-Fraenkel axioms of set theory yields a larger
“Zermelo-Fraenkel-Choice” set theory that also includes Zermelo’s Well-Ordering Principle
and the Axiom of Choice.


6.3.3 Exercises on Maximality and Well-Orderings


6.9. Prove that the hypothesis of Zorn’s Maximal-Element Principle 6.32 implies 
that the partially ordered set is not empty. 


6.10. Prove that the conclusion of Zorn’s Maximal-Set Principle 6.34 requires the 
hypothesis that the set be not empty. 


6.11. Translate Zorn’s Maximal-Set Principle 6.34 into a logical formula, similar 
to formula (6.1) for the Choice-Function Principle 6.8. 


6.12. Translate Zermelo’s Well-Ordering Principle 6.24 into a logical formula, 
similar to formula (6.1) for the Choice-Function Principle 6.8. 


6.13. For each partial order $\preceq$ on a set $E$, let $\mathcal{C} \subseteq \mathcal{P}(E)$ be the set of all chains
relative to $\preceq$ in $E$. Thus each element of $\mathcal{C}$ is a subset of $E$ on which $\preceq$ is a linear
order. Partially order $\mathcal{C}$ by inclusion. Prove that the union of each chain in $\mathcal{C}$ relative
to inclusion is an element of $\mathcal{C}$.


6.14. Prove that Zorn’s Maximal-Set Principle 6.34 is logically equivalent to
Zorn’s Maximal-Element Principle 6.32 within the Zermelo-Fraenkel set theory.


6.4 Unions, Intersections, and Products of Families of Sets 


This section shows that the Choice Principle is equivalent to the distributivity of 
intersections over unions of sets. 


6.4.1 The Multiplicative Principle 


Yet other versions of the Choice Principle rely on Cartesian products, as in 
definitions 6.35 and 6.36. 


6.35 Definition (Cartesian Product). The Cartesian product $\prod_{i \in \mathcal{I}} A_i$ of a
family of sets $\{A_i : i \in \mathcal{I}\}$ is the set of all functions $C \colon \mathcal{I} \to \bigcup_{i \in \mathcal{I}} A_i$ such that
$C(i) \in A_i$ for each $i \in \mathcal{I}$.


6.36 Definition (Multiplicative Principle). For each nonempty family of
nonempty sets $\{A_i : i \in \mathcal{I}\}$, the Cartesian product $\prod_{i \in \mathcal{I}} A_i$ is not empty [88,
p. 117].


The Multiplicative Principle 6.36 and its negation are formulae but not
propositions in the Zermelo-Fraenkel set theory. However, theorem 6.37 shows that the
Multiplicative Principle 6.36 is equivalent to the foregoing choice principles.


6.37 Theorem. Within the Zermelo-Fraenkel set theory, the Family
Choice-Function Principle 6.12 is logically equivalent to the Multiplicative Principle 6.36.


Proof. Definitions 6.11 and 6.35 show that the Cartesian product consists of all the
family choice-functions as its elements. Thus the Cartesian product is nonempty if
and only if there exists a family choice-function. □


6.4.2 The Distributive Principle 


This subsection shows that the Choice Principle is equivalent to the distributivity 
of intersections over unions of families of sets. The following development follows 
Dugundji’s [30, Ch. I, § 9.7-9.8, p. 25]. 


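6.38 Definition (Distributive Principle). For each family of sets $\{A_i : i \in \mathcal{I}\}$ and
each partition $\{\mathcal{I}_\ell : \ell \in \mathcal{L}\}$ of the indexing set $\mathcal{I}$, which sets up equivalence
classes in $\mathcal{I}$, let $\mathcal{K} := \prod_{\ell \in \mathcal{L}} \mathcal{I}_\ell$; then

\[
\bigcup_{k \in \mathcal{K}} \Bigl( \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \Bigr)
= \bigcap_{\ell \in \mathcal{L}} \Bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \Bigr).
\tag{6.2}
\]


6.39 Example. For the self-indexed family $\mathcal{F} := \{A, B, C, D\}$ and the self-indexed
partition $\mathcal{L} := \{\{A, B\}, \{C, D\}\}$, the right-hand side of equation (6.2) becomes

\[
\bigcup_{E \in \{A, B\}} E = A \cup B,
\qquad
\bigcup_{E \in \{C, D\}} E = C \cup D,
\]

\[
\bigcap_{\ell \in \mathcal{L}} \Bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \Bigr) = (A \cup B) \cap (C \cup D).
\]

The Cartesian product of the partitioning equivalence classes is $\mathcal{K} := \{A, B\} \times
\{C, D\} = \{(A, C), (A, D), (B, C), (B, D)\}$, and the left-hand side of equation (6.2)
becomes

\[
\bigcup_{k \in \mathcal{K}} \Bigl( \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \Bigr)
= (A \cap C) \cup (A \cap D) \cup (B \cap C) \cup (B \cap D).
\]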
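

For finite families, both sides of equation (6.2) can be computed and compared directly. The Python sketch below is an illustration only: the function names `left_side` and `right_side`, the encoding of a family as a `dict` from indices to sets, the encoding of a partition as a list of blocks of indices, and the sample sets are all assumptions of this example.

```python
from itertools import product

def right_side(family, partition):
    """Intersection, over the blocks, of the union of the sets indexed by each block."""
    unions = [set().union(*(family[i] for i in block)) for block in partition]
    return set.intersection(*unions)

def left_side(family, partition):
    """Union, over all selectors k (one index per block), of the intersection of family[k(l)]."""
    result = set()
    # product(*partition) enumerates the Cartesian product of the blocks.
    for selector in product(*partition):
        result |= set.intersection(*(set(family[i]) for i in selector))
    return result

# Sample data shaped like example 6.39: blocks {A, B} and {C, D}.
family = {"A": {1, 2, 3}, "B": {3, 4}, "C": {2, 3, 5}, "D": {4, 6}}
partition = [["A", "B"], ["C", "D"]]

lhs = left_side(family, partition)
rhs = right_side(family, partition)
```

Both functions assume a nonempty partition with nonempty blocks. For finite data the Cartesian product of the blocks can always be enumerated, so the equality is unproblematic here; the principle has content only for infinite partitions.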


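The Distributive Principle 6.38 and its negation are not propositions in the
Zermelo-Fraenkel set theory; there they are mere formulae. Nevertheless,
theorem 6.40 shows that one of the inclusions, but only one, implied by equation (6.2)
is a theorem in the Zermelo-Fraenkel set theory.


6.40 Theorem. Within the Zermelo-Fraenkel set theory, for each family of sets
$\{A_i : i \in \mathcal{I}\}$ and each partition $\{\mathcal{I}_\ell : \ell \in \mathcal{L}\}$ of the indexing set $\mathcal{I}$, which
sets up equivalence classes in $\mathcal{I}$, let $\mathcal{K} := \prod_{\ell \in \mathcal{L}} \mathcal{I}_\ell$; then

\[
\bigcup_{k \in \mathcal{K}} \Bigl( \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \Bigr)
\subseteq \bigcap_{\ell \in \mathcal{L}} \Bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \Bigr).
\tag{6.3}
\]


Proof. This proof unravels the definitions of unions, intersections, and Cartesian
products:

\[
\begin{array}{cl}
 & X \in \bigcup_{k \in \mathcal{K}} \bigl( \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \bigr) \\
\Updownarrow & \text{definition of } \bigcup \\
 & \exists k \, \bigl[ (k \in \mathcal{K}) \wedge \bigl( X \in \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \bigr) \bigr] \\
\Updownarrow & \text{definition of } \bigcap \\
 & \exists k \, \bigl[ (k \in \mathcal{K}) \wedge \{ \forall \ell \, [ (\ell \in \mathcal{L}) \Rightarrow (X \in A_{k(\ell)}) ] \} \bigr] \\
\Downarrow & (k \in \mathcal{K}) \Rightarrow \bigl( \forall \ell \, \{ (\ell \in \mathcal{L}) \Rightarrow [ k(\ell) \in \mathcal{I}_\ell ] \} \bigr) \\
 & \forall \ell \, \bigl[ (\ell \in \mathcal{L}) \Rightarrow \{ \exists i \, [ (i \in \mathcal{I}_\ell) \wedge (X \in A_i) ] \} \bigr] \\
\Updownarrow & \text{definition of } \bigcup \\
 & \forall \ell \, \bigl[ (\ell \in \mathcal{L}) \Rightarrow \bigl( X \in \bigcup_{i \in \mathcal{I}_\ell} A_i \bigr) \bigr] \\
\Updownarrow & \text{definition of } \bigcap \\
 & X \in \bigcap_{\ell \in \mathcal{L}} \bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \bigr)
\end{array}
\]

□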

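Theorem 6.41 shows that in the Zermelo-Fraenkel set theory, the Distributive
Principle 6.38 is equivalent to the foregoing choice principles.


6.41 Theorem. Within the Zermelo-Fraenkel set theory, the Multiplicative
Principle 6.36 is logically equivalent to the Distributive Principle 6.38.


Proof. Reversing the last two equivalences in the proof of theorem 6.40 gives

\[
X \in \bigcap_{\ell \in \mathcal{L}} \Bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \Bigr)
\iff
\forall \ell \, \bigl[ (\ell \in \mathcal{L}) \Rightarrow \{ \exists i \, [ (i \in \mathcal{I}_\ell) \wedge (X \in A_i) ] \} \bigr],
\]

which implies that for every $\ell \in \mathcal{L}$,

\[
\{ i \in \mathcal{I}_\ell : X \in A_i \} \neq \varnothing.
\]

If the Multiplicative Principle 6.36 holds, then the Cartesian product contains a
family choice function $k \in \prod_{\ell \in \mathcal{L}} \{ i \in \mathcal{I}_\ell : X \in A_i \}$, so that $k(\ell) \in \{ i \in \mathcal{I}_\ell : X \in A_i \}$,
whence $X \in A_{k(\ell)}$ for each $\ell \in \mathcal{L}$, which is precisely the converse of the
implication in the proof of theorem 6.40. Thus equation (6.2) holds, and so does
the Distributive Principle 6.38.

Conversely, for each nonempty family of nonempty sets $\{\mathcal{I}_\ell : \ell \in \mathcal{L}\}$, let
$\mathcal{I} := \bigcup_{\ell \in \mathcal{L}} \mathcal{I}_\ell$, and for each $i \in \bigcup_{\ell \in \mathcal{L}} \mathcal{I}_\ell$ define $A_i := \{\varnothing\}$ or any other singleton. Thus
for every $\ell \in \mathcal{L}$,

\[
\bigcup_{i \in \mathcal{I}_\ell} A_i = \{\varnothing\},
\qquad
\bigcap_{\ell \in \mathcal{L}} \Bigl( \bigcup_{i \in \mathcal{I}_\ell} A_i \Bigr) = \{\varnothing\}.
\]

If the Distributive Principle 6.38 holds, then equation (6.2) gives

\[
\bigcup_{k \in \mathcal{K}} \Bigl( \bigcap_{\ell \in \mathcal{L}} A_{k(\ell)} \Bigr) = \{\varnothing\},
\]

which is not empty, whence neither is $\mathcal{K} = \prod_{\ell \in \mathcal{L}} \mathcal{I}_\ell$, so that the Multiplicative
Principle 6.36 holds. □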


6.4.3 Exercises on the Distributive and Multiplicative Principles


6.15. Translate the Multiplicative Principle 6.36 into a logical formula, similar to 
formula (6.1) for the Choice-Function Principle 6.8. 


6.16. Translate the Distributive Principle 6.38 into a logical formula, similar to 
formula (6.1) for the Choice-Function Principle 6.8. 


6.5 Equivalence of the Choice, Zorn’s, and Zermelo’s Principles


Based on Dugundji’s proof [30, Ch. II, § 2], this section proves that the
Choice-Function Principle 6.8 implies Zorn’s Maximal-Element Principle 6.32 by means of
the concept of a “tower” of sets relative to a function.


6.5.1 Towers of Sets


This subsection introduces the concept of “towers” of sets relative to a function.


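6.42 Definition (Tower). [30, Ch. II, p. 32, def. 2.2] A nonempty set $\mathcal{T} \subseteq \mathcal{P}(X)$
of subsets of a set $X$ is a tower relative to a function $F \colon \mathcal{T} \to X$ if and only if


(T.A) $\varnothing \in \mathcal{T}$,
(T.B) if $\mathcal{L} \subseteq \mathcal{T}$ is linearly ordered by inclusion, then $\bigcup \mathcal{L} \in \mathcal{T}$, and
(T.C) if $A \in \mathcal{T}$, then $A \cup \{F(A)\} \in \mathcal{T}$.


A sub-tower is a nonempty subset $\mathcal{S} \subseteq \mathcal{T}$ that is also a tower relative to the
restriction $F|_{\mathcal{S}} \colon \mathcal{S} \to X$.

An element $M \in \mathcal{S}$ is medial in a sub-tower $\mathcal{S}$ if and only if $A \subseteq M$ or $M \subseteq A$
for every element $A \in \mathcal{S}$.

A sub-tower is minimal if and only if it does not properly contain any sub-tower.


6.43 Example. The empty set $\varnothing$ is medial in every tower $\mathcal{T}$: indeed, $\varnothing \in \mathcal{T}$ by the
definition 6.42 of towers, and $\varnothing \subseteq A$ for every $A \in \mathcal{T}$. □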


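6.44 Theorem. For each nonempty set $\mathcal{F}$ of towers relative to a common function
$F \colon \bigcup \mathcal{F} \to X$, the intersection $\mathcal{M} := \bigcap \mathcal{F}$ is also a tower relative to $F|_{\mathcal{M}}$.

In particular, the set $\mathcal{F}$ of all sub-towers of a tower $\mathcal{T}$ contains a unique smallest
sub-tower $\mathcal{M} := \bigcap \mathcal{F}$.


Proof. This proof verifies that $\mathcal{M} = \bigcap \mathcal{F}$ satisfies the definition 6.42 of towers.


(T.A) From $\varnothing \in \mathcal{T}$ for every $\mathcal{T} \in \mathcal{F}$ follows $\varnothing \in \bigcap \mathcal{F} = \mathcal{M}$.

(T.B) If $\mathcal{L} \subseteq \mathcal{M}$, then $\mathcal{L} \subseteq \mathcal{T}$ for every $\mathcal{T} \in \mathcal{F}$; if $\mathcal{L}$ is also linearly ordered
by inclusion, then $\bigcup \mathcal{L} \in \mathcal{T}$ for every $\mathcal{T} \in \mathcal{F}$, whence $\bigcup \mathcal{L} \in \bigcap \mathcal{F} = \mathcal{M}$.

(T.C) If $A \in \mathcal{M}$, then $A \in \mathcal{T}$ for every $\mathcal{T} \in \mathcal{F}$, whence $A \cup \{F(A)\} \in \mathcal{T}$ for
every $\mathcal{T} \in \mathcal{F}$, and hence $A \cup \{F(A)\} \in \bigcap \mathcal{F} = \mathcal{M}$.


In particular, the set $\mathcal{F}$ of all sub-towers of a tower $\mathcal{T}$ contains the sub-tower
$\mathcal{M} = \bigcap \mathcal{F} \subseteq \bigcup \mathcal{F} = \mathcal{T}$, because $\mathcal{T} \in \mathcal{F}$ and $\mathcal{S} \subseteq \mathcal{T}$ for every $\mathcal{S} \in \mathcal{F}$.
Also, $\mathcal{M} = \bigcap \mathcal{F} \subseteq \mathcal{S}$ for every $\mathcal{S} \in \mathcal{F}$, so that $\mathcal{M}$ is the smallest sub-tower. □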


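6.45 Theorem. If $M \in \mathcal{M}$ is medial in the minimal sub-tower $\mathcal{M}$ of a tower $\mathcal{T}$,
then either $A \subseteq M$ or $M \cup \{F(M)\} \subseteq A$ for every element $A \in \mathcal{M}$.


Proof. This proof verifies that the set
$\mathcal{S} := \{ A \in \mathcal{M} : (A \subseteq M) \vee (M \cup \{F(M)\} \subseteq A) \}$
is also a sub-tower of the smallest sub-tower $\mathcal{M}$.


(T.A) $\varnothing \in \mathcal{S}$ because $\varnothing \subseteq M$ regardless of $M$. Thus $\mathcal{S} \neq \varnothing$.

(T.B) If $\mathcal{L}$ is a chain in $\mathcal{S}$, then for each $A \in \mathcal{L} \subseteq \mathcal{S}$, either $M \cup \{F(M)\} \subseteq A$
or $A \subseteq M$. Hence two cases can arise:
In the first case, if there exists any $A \in \mathcal{L}$ such that $M \cup \{F(M)\} \subseteq A$, then
$M \cup \{F(M)\} \subseteq A \subseteq \bigcup \mathcal{L}$.
In the second case, if there does not exist any $A \in \mathcal{L}$ such that $M \cup \{F(M)\} \subseteq A$,
then $A \subseteq M$ for every $A \in \mathcal{L} \subseteq \mathcal{S}$ by definition of $\mathcal{S}$, whence $\bigcup \mathcal{L} \subseteq M$.
In either case $\bigcup \mathcal{L} \in \mathcal{S}$.

(T.C) For each $A \in \mathcal{S}$, either $M \cup \{F(M)\} \subseteq A$ or $A \subseteq M$, by definition of $\mathcal{S}$.
In the first case, if $M \cup \{F(M)\} \subseteq A$, then $M \cup \{F(M)\} \subseteq A \subseteq A \cup \{F(A)\}$.
In the second case, if $A \subseteq M$ and $A \in \mathcal{S} \subseteq \mathcal{M}$, then $A \cup \{F(A)\} \in \mathcal{M}$ because
$\mathcal{M}$ is a tower, whence either $M \cup \{F(M)\} \subseteq A \cup \{F(A)\}$ or $A \cup \{F(A)\} \subseteq M$
because $M$ is medial in $\mathcal{M}$. Thus $A \cup \{F(A)\} \in \mathcal{S}$.


Thus $\mathcal{S}$ is a sub-tower of the smallest sub-tower $\mathcal{M}$. Hence $\mathcal{S} = \mathcal{M}$, so that
$A \subseteq M$ or $M \cup \{F(M)\} \subseteq A$ for each medial element $M \in \mathcal{M}$ and each $A \in \mathcal{M}$. □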


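6.46 Theorem. Every element of the smallest sub-tower of any tower is medial in
the smallest sub-tower. The smallest sub-tower is linearly ordered by inclusion.


Proof. This proof consists of verifying that the set $\mathcal{V}$ of all medial elements of $\mathcal{M}$ is
a sub-tower of $\mathcal{M}$.


(T.A) $\varnothing \in \mathcal{V}$ because $\varnothing$ is medial in $\mathcal{M}$ by example 6.43.

(T.B) If $\mathcal{L} \subseteq \mathcal{V}$ is a chain in $\mathcal{V}$ and $B \in \mathcal{L} \subseteq \mathcal{V}$, then $B$ is medial in $\mathcal{M}$, so
that $B \cup \{F(B)\} \subseteq A$ or $A \subseteq B$ for each $A \in \mathcal{M}$, by theorem 6.45.
In the first case, if there exists any $A \in \mathcal{L}$ such that $B \cup \{F(B)\} \subseteq A$, then
$B \cup \{F(B)\} \subseteq A \subseteq \bigcup \mathcal{L}$.
In the second case, if there does not exist any $A \in \mathcal{L}$ such that $B \cup \{F(B)\} \subseteq A$,
then $A \subseteq B$ for every $A \in \mathcal{L} \subseteq \mathcal{V}$, by definition of $\mathcal{V}$, whence $\bigcup \mathcal{L} \subseteq B$.
In either case $\bigcup \mathcal{L} \in \mathcal{V}$.

(T.C) Every $A \in \mathcal{V}$ is medial in $\mathcal{M}$ by definition of $\mathcal{V}$. By theorem 6.45 for
every $B \in \mathcal{M}$ either $B \subseteq A$, in which case $B \subseteq A \subseteq A \cup \{F(A)\}$, or $A \cup \{F(A)\} \subseteq B$.
In either case $A \cup \{F(A)\}$ is medial in $\mathcal{M}$, so that $A \cup \{F(A)\} \in \mathcal{V}$.


Thus $\mathcal{V}$ is a sub-tower of the minimal sub-tower $\mathcal{M}$. Hence $\mathcal{V} = \mathcal{M}$, so that every
element of $\mathcal{M}$ is medial in $\mathcal{M}$. Hence $\mathcal{M}$ is linearly ordered by inclusion. □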


6.47 Theorem. For every tower 𝒯 ⊆ 𝒫(X) relative to a function F: 𝒯 → X
there exists A ∈ 𝒯 such that F(A) ∈ A.


Proof. Let ℳ be the smallest sub-tower, and let A := ⋃ℳ. Then A ∈ ℳ by
definition 6.42 of a tower and because ℳ ⊆ ℳ and ℳ is linearly ordered, by
theorem 6.46. Hence also A ∪ {F(A)} ∈ ℳ by definition 6.42 of a tower, whence
A ∪ {F(A)} ⊆ ⋃ℳ = A, and hence F(A) ∈ A. □


6.5.2 Zorn’s Maximality from the Choice Principle 


This subsection shows that Zorn’s Maximality Principle follows from the Choice 
Principle. 


6.48 Theorem. The Choice-Function Principle 6.8 implies Zorn’s Maximal-Element
Principle 6.32.


Proof. For each set X with a reflexive partial order ≼ let 𝒞 be the set of all chains
relative to ≼ in X.

For each nonempty chain C ∈ 𝒞, let U_C be the set of all upper bounds for C
relative to ≼ in X. The hypotheses of Zorn’s Maximal-Element Principle 6.32 imply
that U_C ≠ ∅.

If the Choice-Function Principle 6.8 holds, then there exists a function
F: {U_C : C ∈ 𝒞} → X such that F(U_C) ∈ U_C, which chooses one upper bound
F(U_C) for each nonempty chain C ∈ 𝒞. Hence define D_C := {K ∈ X : [F(U_C) ≼
K] ∧ {¬[K ≼ F(U_C)]}}, which is the set of all the elements of X that strictly follow
the upper bound F(U_C). If there exists C ∈ 𝒞 such that D_C = ∅, then F(U_C) is a
maximal element of X.

Alternatively, if D_C ≠ ∅ for every C ∈ 𝒞, then, again by the Axiom of
Choice 6.8, there is a function G: {D_C : C ∈ 𝒞} → X such that G(D_C) ∈ D_C
for each C ∈ 𝒞. The hypotheses of Zorn’s Maximal-Element Principle 6.32 also
imply X ≠ ∅, so that there exists Z ∈ X. Define H(∅) := Z and H(C) := G(D_C).
The remainder of the proof verifies that 𝒯 := 𝒞 ∪ {∅} is a tower relative to H.


(T.A) ∅ ∈ 𝒯 by the definition 𝒯 := 𝒞 ∪ {∅}.

(T.B) If 𝒮 is a chain in 𝒯 relative to inclusion, then ⋃𝒮 is a chain relative to
≼ in X. Indeed, for all K, L ∈ ⋃𝒮 there exist A, B ∈ 𝒮 such that K ∈ A and
L ∈ B. Also, 𝒮 is linearly ordered by inclusion, so that A ⊆ B or B ⊆ A. Hence
K, L ∈ B or K, L ∈ A. Moreover, A, B ∈ 𝒮 ⊆ 𝒯 are chains relative to ≼ in X,
whence A and B are linearly ordered, and hence K ≼ L or L ≼ K, in B or A.
Thus ⋃𝒮 ∈ 𝒯.

(T.C) If C ∈ 𝒯, then H(C) is an upper bound for C relative to ≼ in X, whence
C ∪ {H(C)} is also a chain relative to ≼ in X. Hence C ∪ {H(C)} ∈ 𝒯.


By theorem 6.47 there is a chain C ∈ 𝒯 such that H(C) ∈ C. Hence H(C) ≼
F(U_C) because F(U_C) is an upper bound for C. Yet H(C) ∈ D_C is a strict upper
bound with ¬[H(C) ≼ F(U_C)]. This contradiction completes the proof: there is a
chain C ∈ 𝒞 for which D_C = ∅, so that F(U_C) is maximal relative to ≼ in X. □


Zermelo’s Theorem [144] consists of the logical implication that the Well-Ordering
Principle follows from the Axiom of Choice within the Zermelo-Fraenkel-Choice
set theory.

Zorn’s Lemma consists of the logical implication that Zorn’s Maximal-Element
Principle follows from the Axiom of Choice within the Zermelo-Fraenkel-Choice
set theory.


6.49 Definition (Interval). In a set E pre-ordered by a relation ≼ a subset S ⊆ E
is an interval if and only if for all U, W ∈ S and every V ∈ E, if U ≼ V ≼ W, then
V ∈ S.


6.50 Example. The empty subset @ and the whole set E are intervals. 
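

Definition 6.49 can be tested mechanically on a finite pre-ordered set. The following Python lines are only a sketch under assumptions that are not in the text: the function name is_interval, the sample set {0, ..., 9}, and the usual order on integers are all chosen for illustration.

```python
from itertools import product

def is_interval(subset, universe, leq):
    """Check definition 6.49 on a finite set (hypothetical helper).

    subset   -- the candidate S
    universe -- the pre-ordered set E
    leq      -- the pre-order, as a function of two arguments
    """
    s = set(subset)
    for u, w in product(s, repeat=2):
        for v in universe:
            # Any V of E with U <= V <= W must belong to S.
            if leq(u, v) and leq(v, w) and v not in s:
                return False
    return True

universe = range(10)                # assumed sample set E = {0, ..., 9}

def leq(a, b):
    return a <= b                   # assumed order on E

print(is_interval(set(), universe, leq))            # True: the empty subset
print(is_interval(set(universe), universe, leq))    # True: the whole set
print(is_interval({2, 3, 4}, universe, leq))        # True
print(is_interval({2, 4}, universe, leq))           # False: 3 is missing
```

The first two calls reproduce example 6.50 on this sample: the empty subset passes vacuously, and the whole set passes because every V already belongs to it.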


6.5.3 Exercises on Towers of Sets


6.17. Define F: 𝒫(ℕ) → ℕ by F(∅) := 0 and F(A) := min(A) for A ≠ ∅.
Prove that the set 𝒯 of all intervals of ℕ is a tower relative to F.


6.18. For each set E well-ordered by a reflexive relation ≼ define F: 𝒫(E) → E
by F(∅) := min(E) and F(A) := min(A) for A ≠ ∅. Prove that the set 𝒯 of all
intervals of E is a tower relative to F.


6.19. Prove that 𝒫(E) is a tower relative to any function 𝒫(E) → E.


6.20. Prove or disprove that every union of towers is a tower. 


6.6 Yet Other Principles Related to the Axiom of Choice 


This section states principles that are related to the Axiom of Choice, in particular, 
in the sense that they are not propositions within the Zermelo-Fraenkel set theory.


6.6.1 Yet Other Principles Equivalent to the Axiom of Choice 


This subsection states principles that are equivalent to the Choice Principle in the 
Zermelo-Fraenkel set theory.


6.51 Definition (Hausdorff’s Particular Maximal-Chain Principle). In a set of 
sets partially ordered by inclusion, each chain is a subset of a maximal chain [128, 
p. 248, Hi]. 


6.52 Definition (Hausdorff’s Maximal-Chain Principle). Each set of sets has a 
maximal chain relative to inclusion [88, p. 118], [128, p. 248, Ha]. 


6.53 Definition (Hausdorff’s Maximal-Subset Principle). In each partially 
ordered set there is a maximal linearly ordered subset [56, 57, p. 339], [91, p. 69]. 


6.54 Definition (Zermelo’s Principle). For each partition ℱ of a set (of sets) A,
there exists a subset C ⊆ A such that C ∩ B is a singleton for each B ∈ ℱ [88,
p. 117].


6.55 Definition (Counting Principle). For each set E there is an ordinal O and a
bijective function F : O → E with domain O [88, p. 117], [128, p. 241].


6.56 Definition (Kuratowski’s Maximal-Order Principle). For each partial
order R and for each linear order L ⊆ R there exists a maximal (relative to inclusion)
linear order M such that L ⊆ M ⊆ R [88, p. 118].


6.57 Definition (Trichotomy Principle). For all sets A and B, there exists an
injection F with F : A → B with domain A, or F : B → A with domain B [88,
p. 118], [128, p. 241].


6.58 Definition (Mapping Principle). For all nonempty sets A and B, there exists
a surjection F with F: A → B, or F: B → A [88, p. 118].


6.59 Definition. A set (of sets) A is of finite character if and only if 


(FC.A) A ≠ ∅,
(FC.B) if X € A and B C X is finite, then B € A, and 
(FC.C) for each set E, if every finite subset of E is a member of A, then E € A. 


6.60 Definition (Teichmüller-Tukey Maximal-Element Principle). Every set of
finite character has a maximal element [128, p. 249]. 


6.6.2 Consequences of the Axiom of Choice 


This subsection states principles that follow from the Choice Principle in the 
Zermelo-Fraenkel set theory.


6.61 Definition (Finite Sets and Infinite Sets). A set E is finite if and only if there
is N ∈ ℕ and a bijection F : N → E; it is infinite if and only if it is not finite.


6.62 Theorem. In the Zermelo-Fraenkel-Choice set theory, every infinite set
contains a denumerable subset.


Proof. For each set E, let ℰ be the set of all injections from any subset of ℕ into E.
Thus each element F ∈ ℰ is a subset of ℕ × E, and ℰ ⊆ 𝒫(ℕ × E) is partially
ordered by inclusion. If E is infinite, then for each N ∈ ℕ there exists an injection
F: N → E, which is also a subset of ℕ × E. In particular ℰ ≠ ∅.

Each chain ℱ ⊆ ℰ defines an injection ⋃ℱ ∈ ℰ. Indeed, for each (X, Y) ∈
⋃ℱ, there exists F ∈ ℱ such that (X, Y) ∈ F. If also (X, Z) ∈ G ∈ ℱ, then F ⊆ G
or G ⊆ F, because ℱ is a chain, whence Y = Z. Moreover, if (U, Y), (X, Y) ∈
⋃ℱ, then there exist F, G ∈ ℱ such that (X, Y) ∈ F and (U, Y) ∈ G, with F ⊆ G
or G ⊆ F, whence U = X by injectivity of F or G. Thus ⋃ℱ is an upper bound for
ℱ and ⋃ℱ ∈ ℰ. By Zorn’s Maximal-Element Principle 6.32 the set ℰ contains a
maximal element F.

If the domain of F was finite, then a reparametrization of its domain would give 
an injection G: N ~~ N, which would not be injective, by definition of infinite. 
Hence the domain of F is an infinite subset of N. 

Therefore, the image of F is a denumerable subset of E. □

Without the Axiom of Choice, theorem 6.62 fails [66, p. 141].


6.63 Definition (Dedekind Infinite Sets). A set D is Dedekind infinite if and only
if there exist a proper subset B ⊊ D and an injection F: D → B. A set is Dedekind
finite if there are no injections from it into any of its proper subsets.


6.64 Theorem. In the Zermelo-Fraenkel set theory, every Dedekind infinite set is
infinite.


Proof. By contraposition, if a set is finite, then there are no injections from it into
any of its proper subsets, whence it is Dedekind finite. □


6.65 Theorem. In the Zermelo-Fraenkel-Choice set theory, every infinite set is
Dedekind infinite.


Proof. If a set E is infinite, then by theorem 6.62 it contains a denumerable subset
D ⊆ E. In particular, there is a bijection C: ℕ → D. Define B := E \ {C(0)}, and
F: E → B with F(X) := X for every X ∈ E \ D and F[C(N)] := C(N + 1) if
X = C(N) ∈ D. Then F is an injection from E to B. □
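

The shift in the proof of theorem 6.65 can be tried on a concrete case. The sketch below makes assumptions that are not in the text: it takes E to be the natural numbers, takes D to be the even numbers with the hypothetical bijection C(N) = 2N, and checks the claims only on a finite sample, which illustrates the proof without replacing it.

```python
def C(n):
    """Assumed bijection from the natural numbers onto D, the even numbers."""
    return 2 * n

def F(x):
    """The map from the proof: shift D along C, fix every element outside D."""
    if x % 2 == 0:                  # x = C(n) with n = x // 2
        return C(x // 2 + 1)        # F[C(n)] := C(n + 1)
    return x                        # F(x) := x outside D

sample = range(1000)                # finite sample of E
images = [F(x) for x in sample]

print(len(set(images)) == len(images))   # True: F is injective on the sample
print(C(0) in images)                    # False: C(0) is never a value of F
```

On this sample F misses C(0) = 0, so that F maps into the proper subset B = E \ {C(0)}, as in the proof.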


6.66 Definition (Countable Axiom of Choice). Each countable set of nonempty 
sets has a choice function. 


Proofs of the equivalence of compactness and sequential compactness, or 
continuity and sequential continuity — and hence of the existence of extrema of 
continuous functions on closed and bounded intervals — invoke the Countable 
Axiom of Choice [66, p. 21, § 2.4.3] but do not require (the results do not logically 
imply) the Axiom of Choice [111, p. 178]. 


6.6.3 Exercises on Related Principles 


6.21. Prove that in the Zermelo-Fraenkel-Choice set theory every denumerable
union of pairwise disjoint denumerable sets is denumerable.


6.22. Following the R. L. Moore method [15, 146], prove all logical implications 
between all “principles” stated in the present chapter. 


Chapter 7
Applications: Nobel-Prize Winning Applications
of Sets, Functions, and Relations


7.1 Introduction 


This chapter shows concrete applications of sets, functions, and relations: 


1. Arrow’s Impossibility Theorem. Kenneth J. Arrow received the Nobel Prize in
   Economic Science in 1972, mainly for his Impossibility Theorem, from work at
   the RAND Corporation in 1948 [4, p. 328, footnote 1].

2. Gale and Shapley’s Matching Algorithm. Gale and Shapley’s Ph.D. advisor was
   Princeton’s Albert William Tucker; Lloyd S. Shapley received the Nobel Prize in
   Economic Science in 2012, for work that can be traced back to a lecture by John
   von Neumann in 1948 at the RAND Corporation [68, p. 384].

3. Nash’s Equilibrium. Nash’s Ph.D. advisor was Princeton’s Albert William
   Tucker; John Forbes Nash, Jr., received the Nobel Prize in Economic Sciences
   in 1994, for work that can be traced to Melvin Dresher and Merrill Meeks Flood
   in 1950 at the RAND Corporation [12, 124, 125]. He received the Abel Prize in
   2015.


David Gale and Lloyd S. Shapley pointed out that readers without a technical
background may miss the logic in their Nobel-Prize winning work even though it is
written in plain English:


“Yet any mathematician will recognize the argument as mathematical, while people without
mathematical training will probably find difficulty in following the argument” [41,
p. 391].


This chapter shows what mathematicians recognize in the argument: links between 
applications and mathematical sets, functions, and relations. Expositions considered 
concise and elegant by mathematicians are listed in the references. 


7.2 Game Theory


The mathematical analysis of games, called game theory, explains why and
predicts how people will rationally make decisions against their interest. Examples
include decisions whether to arm oneself. For their contributions to game theory,
John Harsanyi, Reinhard Selten, and John Forbes Nash, Jr., received the Nobel
Prize in Economics in 1994 [93-95]. This introduction to game theory relies on
mathematical functions and ordering relations.


7.2.1 Introduction 


Publicized by Albert W. Tucker as A Two-Person Dilemma [131], but now known 
as The Prisoner’s Dilemma [132], a prototype of the subject of game theory 
originated through the work of Melvin Dresher and Merrill Meeks Flood in 1950 
at the RAND Corporation [12, 124, 125]. Similar games explain the nuclear arms
race (see Figure 7.1) [13, 126].


Fig. 7.1 Early days of the nuclear arms race. Left: a Soviet modified SS-6 (Sapwood) with
Sputnik 2 on 3 November 1957, photograph courtesy NASA. Right: an Atlas rocket lifts off,
photograph courtesy U.S. Air Force. (http://www.history.nasa.gov/SP-4408pt1.pdf)
(http://www.nationalmuseum.af.mil/shared/media/photodb/photos/050406-F-1234P-014.jpg)


7.1 Example (The Prisoner’s Dilemma [131, 132]). The police charge Al and Bo
with a crime and hold them in separate cells, so that Al and Bo cannot communicate
with each other. Al and Bo believe that they have at least a first strategy:


(PD.1) If Al and Bo do not confess to the crime, then they both go free.


However, the police give each of them another strategy, to confess:


(PD.2) If both Al and Bo confess to the crime, then each gets sentenced to prison.
(PD.3) If one confesses but the other does not confess, then whoever confesses
gets a reward and goes free, while whoever does not confess gets a death
sentence.


Al and Bo’s dilemma is whether to confess or not to confess. 


The Prisoner’s Dilemma also applies to two countries, or their rulers, or any two
individuals, Al and Bo, deciding whether to cooperate by not arming themselves, or to
defect from the agreement of cooperation and arm themselves. The Prisoner’s Dilemma
predicts that they will both arm themselves, as shown in figure 7.1, and explains
why [13, 126], as demonstrated in subsection 7.2.2. Game theory also explains
animal behavior, for example musth in male African elephants [87]. To get a sense
of Nash’s work, the reader may at this stage try to design a method to analyze
mathematical games such as The Prisoner’s Dilemma but with any number of
players and strategies.


7.2.2 Mathematical Models for The Prisoner’s Dilemma 


This subsection describes ways to design a mathematical model and analyze games 
such as The Prisoner’s Dilemma. 


7.2 Example (The Prisoner’s Dilemma, continued [131, 132]). What Al and Bo 
eventually get (reward, freedom, prison, or death) is called their payoff. The first 
step in designing a mathematical model of the game consists in modeling the 
players’ preferences with ordering relations on each player’s set of payoffs. In this 
example, it seems reasonable to assume that Al and Bo rank all four payoffs in the 
same order of preference, with > meaning “better than”:


reward & freedom > freedom > prison > death. 


The second step in designing a mathematical model of the game consists in 
modeling every way the players may play by organizing their actions, called 
strategies, and corresponding payoffs in a table. Table 7.3 summarizes Al’s and 
Bo’s strategies and their resulting payoffs in The Prisoner’s Dilemma. Al has only 
two strategies available: either to confess or not to confess. Similarly, Bo may either 
confess or not confess. Moreover, The Prisoner’s Dilemma is a noncooperative 
game in the sense that neither player knows the other player’s action in advance. 

The analysis of the game then examines every combination of strategies in the
table. For instance, Al may examine each of Al’s own strategies and seek to avoid
the worst payoff.


Al confesses. If Al confesses, then two cases can occur: Bo can either
confess or not confess.

    Bo confesses. If Bo also confesses, then Al faces a prison term.
    Bo does not confess. If Bo does not confess, then Al goes free with a reward.

    Thus, if Al confesses, then the worst that can happen to Al is a prison term.

Al does not confess. Similarly, if Al does not confess, then two cases can occur:
Bo can either confess or not confess.

    Bo confesses. If Bo confesses, then Al faces death.
    Bo does not confess. If Bo does not confess either, then Al goes free.

    Thus, if Al does not confess, then the worst that can happen to Al is death.


Consequently, to avoid death with certainty but without knowing Bo’s action, Al 
must confess. 

Table 7.3 is symmetric in the sense that swapping Al’s and Bo’s réles does not 
change their payoffs. Thus Al’s reasoning also applies to Bo, who must also confess 
to avoid death with certainty. 

Therefore, to avoid death with certainty, Al and Bo both confess, even though 
they would both be better off by not confessing. 
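

The worst-case reasoning of example 7.2 can be replayed in a few lines of Python. This is only a sketch under stated assumptions: the numerical ranks 0 to 3 merely encode the order of preference death, prison, freedom, reward, and the names RANK, AL_PAYOFF, worst_case, and maximin_strategy are hypothetical.

```python
# Assumed numerical encoding of the order of preference (larger is better).
RANK = {"death": 0, "prison": 1, "freedom": 2, "reward": 3}

# AL_PAYOFF[al_strategy][bo_strategy] is Al's payoff, read from Table 7.3;
# "reward" stands for freedom with a reward.
AL_PAYOFF = {
    "confess": {"confess": "prison", "not confess": "reward"},
    "not confess": {"confess": "death", "not confess": "freedom"},
}

def worst_case(payoffs_by_opponent):
    """Worst payoff that one strategy can yield, over the opponent's strategies."""
    return min(payoffs_by_opponent.values(), key=RANK.__getitem__)

def maximin_strategy(payoff_table):
    """Strategy whose worst payoff is the best among all worst payoffs."""
    return max(payoff_table, key=lambda s: RANK[worst_case(payoff_table[s])])

for strategy, row in AL_PAYOFF.items():
    print(strategy, "->", worst_case(row))
# confess -> prison
# not confess -> death

print(maximin_strategy(AL_PAYOFF))   # confess
```

By the symmetry of Table 7.3, the same computation with Bo’s payoffs gives the same answer for Bo.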


7.4 Example (Arms race [13, 126]). If two individuals or countries cooperate not to
arm themselves, then they remain free and safe. If both defect and arm themselves,
then they remain safe but incur a penalty with the cost of armament. If either
one defects and arms itself while the other one does not arm itself, then whoever
arms itself stays free and gets a reward by conquering and looting the other, who
suffers from death, destruction, and losing the property to the conqueror. The
ordering of the outcomes just described is the same as that in table 7.3 for
The Prisoner’s Dilemma. Therefore, the analysis of The Prisoner’s Dilemma in
example 7.2 predicts that they will both arm themselves and explains why.


Example 7.5 shows another way to analyze The Prisoner’s Dilemma. 


Table 7.3 Payoffs for the Prisoner’s Dilemma.

                                      BO’S STRATEGIES
                              CONFESS            |  NOT CONFESS
                              PAYOFFS TO         |  PAYOFFS TO
                              AL      | BO       |  AL       | BO
AL’S          CONFESS         prison  | prison   |  reward   | death
STRATEGIES    NOT CONFESS     death   | reward   |  freedom  | freedom


7.5 Example (The Prisoner’s Dilemma, continued [131, 132]). In contrast to
example 7.2, Al may focus first on Bo’s strategies and seek to get the best payoff.
Again, Bo may either confess or not confess.


Bo confesses. If Bo confesses, then Al has two choices.

    Al confesses. If Al confesses, too, then Al gets a prison term.
    Al does not confess. If Al does not confess, then Al gets death.

    Thus, if Bo confesses, then Al gets the better of the two payoffs available (prison
    or death) by confessing.

Bo does not confess. If Bo does not confess, then Al has two choices.

    Al confesses. If Al confesses, then Al goes free with a reward.
    Al does not confess. If Al does not confess, then Al goes free.

    Thus, if Bo does not confess, then Al also gets the better payoff by confessing.


Thus, regardless of Bo’s action, Al gets the better payoff available by confessing.
Therefore, Al confesses. The same analysis applies to Bo, who also confesses.


7.2.3 Dominant Strategies 


Some of the modeling and analysis of The Prisoner’s Dilemma carry over to other
games. To this end, this subsection shows that a special type of strategy contains an
equilibrium, from which no player has any incentive to switch to another strategy.
For a game with exactly two players, Al and Bo, assume that Al has a nonempty set
of strategies $S_{Al}$ and a nonempty set of payoffs $P_{Al}$, while Bo has a nonempty set of
strategies $S_{Bo}$ and a nonempty set of payoffs $P_{Bo}$. Assume also that Al and Bo have a
linear order “>” on their sets of payoffs. If Al plays a strategy $A \in S_{Al}$ and Bo plays
a strategy $B \in S_{Bo}$, then Al gets a payoff $p^{Al}_{A,B}$, while Bo gets a payoff $p^{Bo}_{A,B}$. Thus for
each $C \in \{Al, Bo\}$, $p^{C}$ is a function $p^{C} : S_{Al} \times S_{Bo} \to P_{C}$. Table 7.6 shows the payoff
table of a game for two players with two strategies.


7.7 Definition (dominant strategy). A strategy is weakly dominant for a
player (Al, for instance) if and only if, for each fixed combination of the other
players’ strategies, that player (Al) cannot get a higher payoff by switching to
another strategy. A strategy is strongly dominant for a player (Al) if and only if,
for each fixed combination of the other players’ strategies, that player (Al) always
gets a lower payoff by switching to another strategy.


Table 7.6 Payoffs for noncooperative two-person games with two strategies.

                                          BO’S STRATEGIES
                              L (“LEFT”)                        |  R (“RIGHT”)
                              PAYOFFS TO                        |  PAYOFFS TO
                              AL              | BO              |  AL              | BO
AL’S          T (“TOP”)       $p^{Al}_{TL}$   | $p^{Bo}_{TL}$   |  $p^{Al}_{TR}$   | $p^{Bo}_{TR}$
STRATEGIES    B (“BOTTOM”)    $p^{Al}_{BL}$   | $p^{Bo}_{BL}$   |  $p^{Al}_{BR}$   | $p^{Bo}_{BR}$


7.8 Example. In The Prisoner’s Dilemma (examples 7.1 and 7.5), Al’s strongly 
dominant strategy is to confess: If Bo confesses, then Al gets a higher payoff by 
confessing (prison) than by not confessing (death). Similarly, if Bo does not confess, 
then Al also gets a higher payoff by confessing (freedom and a reward) than by 
not confessing (freedom). By symmetry of The Prisoner’s Dilemma, Bo’s strongly 
dominant strategy is also to confess. 
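

Definition 7.7 translates directly into a test over a payoff table. The following sketch checks Al’s strategies in The Prisoner’s Dilemma; the numerical ranks and all the names (RANK, PAYOFF, al_rank, and the two test functions) are assumptions introduced for illustration, not notation from the text.

```python
# Assumed numerical encoding of the order of preference (larger is better).
RANK = {"death": 0, "prison": 1, "freedom": 2, "reward": 3}
STRATEGIES = ("confess", "not confess")

# PAYOFF[(al, bo)] = (payoff to Al, payoff to Bo), read from Table 7.3.
PAYOFF = {
    ("confess", "confess"): ("prison", "prison"),
    ("confess", "not confess"): ("reward", "death"),
    ("not confess", "confess"): ("death", "reward"),
    ("not confess", "not confess"): ("freedom", "freedom"),
}

def al_rank(al, bo):
    """Rank of Al's payoff at the position (al, bo)."""
    return RANK[PAYOFF[(al, bo)][0]]

def is_weakly_dominant_for_al(s):
    """Al never gets a higher payoff by switching away from s."""
    return all(al_rank(s, bo) >= al_rank(other, bo)
               for bo in STRATEGIES for other in STRATEGIES)

def is_strongly_dominant_for_al(s):
    """Al always gets a lower payoff by switching away from s."""
    return all(al_rank(s, bo) > al_rank(other, bo)
               for bo in STRATEGIES for other in STRATEGIES if other != s)

for s in STRATEGIES:
    print(s, is_weakly_dominant_for_al(s), is_strongly_dominant_for_al(s))
# confess True True
# not confess False False
```

The output agrees with example 7.8: confessing is strongly dominant for Al, and hence also weakly dominant.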


7.9 Definition (dominant strategy equilibrium). A position is an array of strate- 
gies, with one strategy for each player. A weakly (respectively strongly) dominant 
strategy equilibrium is a position where each player plays a weakly (respectively 
strongly) dominant strategy. 


7.10 Example. In The Prisoner’s Dilemma, the position where Al and Bo both 
confess is a dominant strategy equilibrium, where their dominant strategies meet. 
Neither Al nor Bo has any incentive to switch to another strategy (not to confess), 
because whoever decides not to confess while the other still confesses faces death. 


7.11 Definition (Nash equilibrium). A Nash equilibrium is a position from
which no player can get a higher payoff by changing only that player’s own strategy, while
all the other players’ strategies remain fixed [93, p. 287].


7.12 Example. In The Prisoner’s Dilemma, the position where Al and Bo both 
confess is a Nash equilibrium: If Al decides not to confess but Bo still confesses, 
then Al’s payoff drops. Similarly, if Bo decides not to confess but Al still confesses, 
then Bo’s payoff drops. In contrast, the position where neither Al nor Bo confess is 
not a Nash equilibrium, because whoever decides to confess gets a higher payoff if 
the other one still does not confess. 
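

For a finite game with two players, definition 7.11 can be checked by testing every position. The sketch below does so for The Prisoner’s Dilemma; the function name pure_nash_equilibria and the numerical payoffs (death = 0, prison = 1, freedom = 2, reward = 3) are assumptions that only encode the order of preference.

```python
from itertools import product

def pure_nash_equilibria(payoff, rows, cols):
    """List the positions from which neither player gains by switching alone.

    payoff[(r, c)] = (payoff to the row player, payoff to the column player).
    """
    equilibria = []
    for r, c in product(rows, cols):
        # The row player cannot do better in the same column.
        row_ok = all(payoff[(r2, c)][0] <= payoff[(r, c)][0] for r2 in rows)
        # The column player cannot do better in the same row.
        col_ok = all(payoff[(r, c2)][1] <= payoff[(r, c)][1] for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

STRATEGIES = ("confess", "not confess")

# Assumed numerical payoffs (Al, Bo) encoding Table 7.3.
PRISONERS_DILEMMA = {
    ("confess", "confess"): (1, 1),
    ("confess", "not confess"): (3, 0),
    ("not confess", "confess"): (0, 3),
    ("not confess", "not confess"): (2, 2),
}

print(pure_nash_equilibria(PRISONERS_DILEMMA, STRATEGIES, STRATEGIES))
# [('confess', 'confess')]
```

The search confirms example 7.12: the position where both confess is the only Nash equilibrium with pure strategies.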


7.13 Theorem. Every weakly dominant strategy equilibrium is a Nash equilibrium.


Proof. Table 7.14 shows the conditions for either of Bo’s strategies to be weakly
dominant in a game with only one other player (Al).


Table 7.14 Bo’s potentially weakly dominant strategies.

                                   BO’S STRATEGIES
                     L IS WEAKLY DOMINANT:            OR    R IS WEAKLY DOMINANT:
                     PAYOFFS TO BO                          PAYOFFS TO BO
AL’S          T      $p^{Bo}_{TL} \ge p^{Bo}_{TR}$          $p^{Bo}_{TL} \le p^{Bo}_{TR}$
STRATEGIES           AND                              OR    AND
              B      $p^{Bo}_{BL} \ge p^{Bo}_{BR}$          $p^{Bo}_{BL} \le p^{Bo}_{BR}$


If R is Bo’s single weakly dominant strategy (Bo’s R column in table 7.14), then
Bo plays R. Indeed, if Al plays T, then R gets Bo a larger payoff, because $p^{Bo}_{TL} \le p^{Bo}_{TR}$;
if Al plays B, then R also gets Bo a larger payoff, because $p^{Bo}_{BL} \le p^{Bo}_{BR}$.

Since Al knows the game, Al knows that R is Bo’s single weakly dominant
strategy. Thus, Al knows that Bo will play R. Consequently, Al has only two payoffs
available: $p^{Al}_{TR}$ or $p^{Al}_{BR}$. Therefore, Al chooses a strategy that yields the larger available
payoff. For instance, if $p^{Al}_{TR} \le p^{Al}_{BR}$, then Al plays B.

Hence the game ends at (B,R) with payoffs $p^{Al}_{BR}$ to Al and $p^{Bo}_{BR}$ to Bo. From (B,R)
Bo cannot get a higher payoff by switching to L because $p^{Bo}_{BL} \le p^{Bo}_{BR}$. From (B,R) Al
cannot get a higher payoff by switching to T because $p^{Al}_{TR} \le p^{Al}_{BR}$. Thus (B,R) is a
Nash equilibrium.

If R and L are both weakly dominant strategies for Bo, then Bo’s payoffs are
identical in R and L. If $p^{Al}_{BR}$ is the largest of Al’s four payoffs in Bo’s weakly
dominant strategies, then (B,R) is still a Nash equilibrium. Yet Al no longer knows
which of R or L Bo chooses: the game might not end at a Nash equilibrium. □


7.2.4 Mixed Strategies 


Some games or situations might not occur more than once in the players’ lifetime,
for instance, The Prisoner’s Dilemma. Yet other games or situations may occur more
than once in the players’ lifetime. Each player may then adopt a mixed strategy,
defined by choosing one strategy some of the time and another strategy at other 
times. Each player’s payoff may then be a weighted average of the payoffs from 
the different strategies in the mixed strategy. For emphasis, strategies that are not 
mixed, for instance, as in The Prisoner’s Dilemma, are called pure strategies. 


7.15 Example (The Battle of the Sexes [92]). Al and Bo would like to go to a 
show together. Al prefers a Jazz concert while Bo prefers a play at the theater. 
Nevertheless, they would prefer to go to the same show together, rather than to 
different shows separately. After some negotiation they agree on a show. On the 
day of the show, however, they each forget which show they had agreed on, but 
cannot communicate with each other. So they each must decide to which show to 
go, without knowing in advance where the other will go. Table 7.16 shows their 
payoffs. In hope of finding Bo, Al goes to the Jazz concert one half of the time, and
to the play at the theater the other half of the time. In hope of finding Al, Bo does
the same.


Table 7.16 Payoffs for the Battle of the Sexes.

                              BO GOES TO
                      JAZZ (J)        |  PLAY (P)
                      PAYOFFS TO      |  PAYOFFS TO
                      AL    | BO      |  AL    | BO
AL         JAZZ (J)   3     | 2       |  1     | 1
GOES TO    PLAY (P)   0     | 0       |  2     | 3


Table 7.17 Payoffs for the Battle of the Sexes over four days.

                              BO GOES TO
                      JAZZ (J)            |  PLAY (P)
                      PAYOFFS TO          |  PAYOFFS TO
                      AL          | BO    |  AL          | BO
AL         JAZZ (J)   3 (Day 1)   | 2     |  1 (Day 3)   | 1
GOES TO    PLAY (P)   0 (Day 2)   | 0     |  2 (Day 4)   | 3


Table 7.17 shows the four combinations of strategies; for this example,
assume that they occur equally frequently. Thus Al’s and Bo’s average payoffs over
four days are

\[ p^{Al}(1/2, 1/2) = \frac{3 + 0 + 1 + 2}{4} = \frac{3}{2}, \]

\[ p^{Bo}(1/2, 1/2) = \frac{2 + 0 + 1 + 3}{4} = \frac{3}{2}. \]

Al now decides to go to the Jazz concert every day, while Bo still goes to Jazz every
other day and to the play every other day. There are then only two kinds of days,
and their average payoffs become

\[ p^{Al}(1, 1/2) = \frac{3 + 1}{2} = 2, \]

\[ p^{Bo}(1, 1/2) = \frac{2 + 1}{2} = \frac{3}{2}. \]


Hence arises the question whether other mixed strategies can increase both Al’s 
and Bo’s payoffs. 


7.2.5 Existence of Nash Equilibria for Two Players 
with Two Mixed Strategies 


This section shows that if neither player has any weakly dominant pure strategy, then 
the game still has a Nash equilibrium for some combination of mixed strategies. 


7.18 Definition (mixed strategy). The same players might play multiple rounds
of the same game and independently choose any available strategy for each round.
Over M rounds, Al may play T exactly T times and B exactly B times, with T +
B = M. Thus Al plays T with a frequency $t = T/M$ and B with a frequency
$b = B/M = 1 - t$. The ordered pair of frequencies $(t, 1 - t)$ is called a mixed
strategy for Al. Meanwhile, Bo may play L exactly L times and R exactly R times,
with L + R = M. Thus Bo plays L with a frequency $\ell = L/M$ and R with a frequency
$r = R/M = 1 - \ell$. Thus $(\ell, 1 - \ell)$ is a mixed strategy for Bo. Mixed strategies are
also subject to the condition that Al and Bo play (T,L) with frequency $t \cdot \ell$ and hence
also receive payoffs $p^{Al}_{TL}$ and $p^{Bo}_{TL}$ with frequency $t \cdot \ell$. Similar products of frequencies
apply to the other plays (T,R), (B,L), and (B,R). The sums of such payoffs to Al and
Bo are

\[ p^{Al}(t, \ell) = t \cdot \ell \cdot p^{Al}_{TL} + t \cdot (1 - \ell) \cdot p^{Al}_{TR} + (1 - t) \cdot \ell \cdot p^{Al}_{BL} + (1 - t) \cdot (1 - \ell) \cdot p^{Al}_{BR}, \]

\[ p^{Bo}(t, \ell) = t \cdot \ell \cdot p^{Bo}_{TL} + t \cdot (1 - \ell) \cdot p^{Bo}_{TR} + (1 - t) \cdot \ell \cdot p^{Bo}_{BL} + (1 - t) \cdot (1 - \ell) \cdot p^{Bo}_{BR}. \]
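

The sums in definition 7.18 can be evaluated exactly with rational arithmetic. The following sketch applies them to the Battle of the Sexes; it assumes the correspondence T = Jazz and B = Play for Al, L = Jazz and R = Play for Bo, and the names AL, BO, and mixed_payoff are hypothetical.

```python
from fractions import Fraction

# Payoffs read from Table 7.16 under the assumed correspondence
# T = Jazz, B = Play (Al's strategies) and L = Jazz, R = Play (Bo's strategies).
AL = {"TL": 3, "TR": 1, "BL": 0, "BR": 2}
BO = {"TL": 2, "TR": 1, "BL": 0, "BR": 3}

def mixed_payoff(p, t, ell):
    """Sum of the four payoffs in p, weighted by the products of frequencies.

    t   -- frequency with which Al plays T
    ell -- frequency with which Bo plays L
    """
    return (t * ell * p["TL"] + t * (1 - ell) * p["TR"]
            + (1 - t) * ell * p["BL"] + (1 - t) * (1 - ell) * p["BR"])

half = Fraction(1, 2)
print(mixed_payoff(AL, half, half), mixed_payoff(BO, half, half))   # 3/2 3/2
print(mixed_payoff(AL, 1, half), mixed_payoff(BO, 1, half))         # 2 3/2
```

Both lines of output agree with the averages computed by hand in example 7.15.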


7.19 Theorem. Every game for two players with two pure strategies has a Nash 
equilibrium. 


Proof. For Bo’s payoff, collecting similar powers of $t$ and $\ell$ leads after some algebra
to the equivalent formula

\[ p^{Bo}(t, \ell) = \ell \cdot \{ t \cdot [(p^{Bo}_{TL} - p^{Bo}_{TR}) + (p^{Bo}_{BR} - p^{Bo}_{BL})] - (p^{Bo}_{BR} - p^{Bo}_{BL}) \} + t \cdot (p^{Bo}_{TR} - p^{Bo}_{BR}) + p^{Bo}_{BR}. \]

Al has a problem: if Al can choose a frequency $t^*$ such that

\[ t^* \cdot [(p^{Bo}_{TL} - p^{Bo}_{TR}) + (p^{Bo}_{BR} - p^{Bo}_{BL})] - (p^{Bo}_{BR} - p^{Bo}_{BL}) = 0, \]

then Bo’s payoff becomes

\[ p^{Bo}(t^*, \ell) = t^* \cdot (p^{Bo}_{TR} - p^{Bo}_{BR}) + p^{Bo}_{BR}, \]

which does not depend on $\ell$ and thus strips away from Bo all control over Bo’s
own payoff. In particular, Bo cannot get a higher payoff by switching to a different
frequency $\ell$. Yet if Bo can choose a frequency $\ell^*$ so that Al cannot control Al’s own
payoff, then they are at a Nash equilibrium $(t^*, \ell^*)$. The following considerations
show that such a Nash equilibrium exists provided that neither Al nor Bo has any
weakly dominant strategy. Table 7.20 shows the conditions for Bo not to have any
weakly dominant strategy, obtained from the logical negation of Table 7.14. Two
cases emerge: either $p^{Bo}_{TL} < p^{Bo}_{TR}$ and $p^{Bo}_{BL} > p^{Bo}_{BR}$, or $p^{Bo}_{BL} < p^{Bo}_{BR}$ and $p^{Bo}_{TL} > p^{Bo}_{TR}$.


Table 7.20 Conditions for Bo to have no weakly dominant pure strategies.

                                   BO’S STRATEGIES
                     L IS NOT WEAKLY DOMINANT:        AND   R IS NOT WEAKLY DOMINANT:
                     PAYOFFS TO BO                          PAYOFFS TO BO
AL’S          T      $p^{Bo}_{TL} < p^{Bo}_{TR}$            $p^{Bo}_{TL} > p^{Bo}_{TR}$
STRATEGIES           OR                               AND   OR
              B      $p^{Bo}_{BL} < p^{Bo}_{BR}$            $p^{Bo}_{BL} > p^{Bo}_{BR}$


In the first case, $p^{Bo}_{TL} < p^{Bo}_{TR}$ and $p^{Bo}_{BL} > p^{Bo}_{BR}$. Hence $p^{Bo}_{TR} - p^{Bo}_{TL} > 0$ and
$p^{Bo}_{BL} - p^{Bo}_{BR} > 0$. Consequently,

\[ (p^{Bo}_{TL} - p^{Bo}_{TR}) + (p^{Bo}_{BR} - p^{Bo}_{BL}) \neq 0. \]

Thus, Al can choose a frequency $t^*$ between 0 and 1 where Bo has no control over
Bo’s payoff:

\[ 1 > t^* = \frac{p^{Bo}_{BR} - p^{Bo}_{BL}}{(p^{Bo}_{TL} - p^{Bo}_{TR}) + (p^{Bo}_{BR} - p^{Bo}_{BL})} > 0. \]

Similarly, if Al has no weakly dominant strategies, then Bo can choose a frequency
$\ell^*$ between 0 and 1 where Al has no control over Al’s payoff, because

\[ 1 > \ell^* = \frac{p^{Al}_{BR} - p^{Al}_{TR}}{(p^{Al}_{TL} - p^{Al}_{BL}) + (p^{Al}_{BR} - p^{Al}_{TR})} > 0. \]

With such frequencies,

\[ p^{Al}(t^*, \ell^*) = \ell^* \cdot p^{Al}_{BL} + (1 - \ell^*) \cdot p^{Al}_{BR}, \]

\[ p^{Bo}(t^*, \ell^*) = t^* \cdot p^{Bo}_{TR} + (1 - t^*) \cdot p^{Bo}_{BR}. \]

Neither Al nor Bo can change their own payoff by switching to another frequency
while the other player’s frequency is fixed. Thus $(t^*, \ell^*)$ is a Nash equilibrium. The
second case, where $p^{Bo}_{BL} < p^{Bo}_{BR}$ and $p^{Bo}_{TL} > p^{Bo}_{TR}$, is entirely similar.

If a player has a dominant strategy, then such a strategy contains a Nash
equilibrium, as in the proof of theorem 7.13. □
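

The frequencies t* and ℓ* from the proof can be computed for a concrete game. The sketch below uses the Battle of the Sexes, where neither player has a weakly dominant pure strategy; it assumes the correspondence T = Jazz, B = Play, L = Jazz, R = Play, and the names AL, BO, mixed_payoff, and indifference_frequencies are hypothetical.

```python
from fractions import Fraction

# Payoffs read from Table 7.16 under the assumed correspondence
# T = Jazz, B = Play (Al's strategies) and L = Jazz, R = Play (Bo's strategies).
AL = {"TL": 3, "TR": 1, "BL": 0, "BR": 2}
BO = {"TL": 2, "TR": 1, "BL": 0, "BR": 3}

def mixed_payoff(p, t, ell):
    """Payoff from definition 7.18 with frequencies t (Al plays T), ell (Bo plays L)."""
    return (t * ell * p["TL"] + t * (1 - ell) * p["TR"]
            + (1 - t) * ell * p["BL"] + (1 - t) * (1 - ell) * p["BR"])

def indifference_frequencies(al, bo):
    """Frequencies (t*, ell*) from the proof of theorem 7.19.

    Raises ZeroDivisionError if a denominator vanishes, which the proof
    excludes when neither player has a weakly dominant strategy.
    """
    t_star = Fraction(bo["BR"] - bo["BL"],
                      (bo["TL"] - bo["TR"]) + (bo["BR"] - bo["BL"]))
    ell_star = Fraction(al["BR"] - al["TR"],
                        (al["TL"] - al["BL"]) + (al["BR"] - al["TR"]))
    return t_star, ell_star

t_star, ell_star = indifference_frequencies(AL, BO)
print(t_star, ell_star)                                   # 3/4 1/4

# With t* fixed, Bo's payoff does not depend on Bo's frequency.
for ell in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(mixed_payoff(BO, t_star, ell))                  # 3/2 each time

# With ell* fixed, Al's payoff does not depend on Al's frequency.
for t in (Fraction(0), Fraction(1, 2), Fraction(1)):
    print(mixed_payoff(AL, t, ell_star))                  # 3/2 each time
```

In this game each player’s payoff at the mixed equilibrium equals 3/2, the same as with the half-and-half frequencies of example 7.15, but now neither player can change that payoff alone.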


For Nash’s equilibria with two players and two strategies, see also 
[106, p. 138-139], [107], [119, p. 155-168]. 


7.21 Example (The Professors’ Problem). Al and Bo are on the tenured faculty 
at King Game’s College. Each may either teach students or sit on committees. 
Teaching does not hamper any one’s work, but committees hamper other faculty 
members’ work, to the extent summarized by Al’s and Bo’s end-of-the-year bonus 
payoffs in table 7.22. The sum of Al’s and Bo’s payoffs reflects the value of their 
work to the College. Al and Bo know table 7.22 but work in different offices and 
thus do not cooperate with each other; hence they must choose a strategy without 
knowing in advance what the other is doing. 

Bo has exactly one weakly dominant strategy, which is to sit on committees;
consequently, Bo declines to teach and decides to sit. Al knows table 7.22 and thus
knows that Bo will sit on committees; therefore, to get the higher payoff available
to Al under Bo’s decision to sit, Al also declines to teach and decides to sit on
committees.


Table 7.22 Payoffs for The Professors’ Problem.

                              BO’S STRATEGIES
                      TEACH           |  SIT
                      PAYOFFS TO      |  PAYOFFS TO
                      AL    | BO      |  AL    | BO
AL’S         TEACH    6     | 9       |  2     | 9
STRATEGIES   SIT      5     | 1       |  3     | 2


Table 7.23 Payoffs for The Administration’s Response.

                              BO’S STRATEGIES
                      TEACH                  |  SIT
                      PAYOFFS TO             |  PAYOFFS TO
                      AL            | BO     |  AL    | BO
AL’S         TEACH    4             | 9      |  4     | 5
STRATEGIES   SIT      [illegible]   | 1      |  3     | 2

Not only do Al and Bo choose the Nash equilibrium where they are both worse 
off than in the other Nash equilibrium, but the sum of the values of their work to the 
College is the worst of all possibilities. 
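

The claims about The Professors’ Problem can be confirmed by testing all four positions of table 7.22. The following sketch does so; the names STRATEGIES, PAYOFF, and is_nash are hypothetical, and the payoffs are those read from table 7.22.

```python
from itertools import product

STRATEGIES = ("teach", "sit")

# PAYOFF[(al, bo)] = (payoff to Al, payoff to Bo), read from Table 7.22.
PAYOFF = {
    ("teach", "teach"): (6, 9),
    ("teach", "sit"): (2, 9),
    ("sit", "teach"): (5, 1),
    ("sit", "sit"): (3, 2),
}

def is_nash(al, bo):
    """Neither Al nor Bo gets a higher payoff by switching alone."""
    al_pay, bo_pay = PAYOFF[(al, bo)]
    return (all(PAYOFF[(a, bo)][0] <= al_pay for a in STRATEGIES)
            and all(PAYOFF[(al, b)][1] <= bo_pay for b in STRATEGIES))

equilibria = [pos for pos in product(STRATEGIES, repeat=2) if is_nash(*pos)]
print(equilibria)            # [('teach', 'teach'), ('sit', 'sit')]

# Value of the work to the College: sum of both payoffs at each position.
totals = {pos: sum(pay) for pos, pay in PAYOFF.items()}
print(min(totals, key=totals.get), min(totals.values()))   # ('sit', 'sit') 5
```

The game thus has two Nash equilibria with pure strategies, and the one that Al and Bo reach has the smallest total payoff, 5, against 15 where both teach.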

The administration’s challenge is to entice the faculty to teach by modifying the 
payoff table. 

Table 7.23 shows the new game on campus after the administration capped 
payoffs from committees to 5 units. If Bo sits on committees, then Al may also 
sit for a payoff of 4 rather than teach for 6: Al now values teaching only up to 4. 


To get a deeper sense of one of Nash’s many contributions, the reader may 
attempt proving the existence of a Nash equilibrium in games for any number of 
players with any number of strategies. 


7.24 Example (Blue against Red [26]). In Melvin Dresher’s initial context, the 
Blue and Red commanders led opposing armed forces [26, p. 4], but they might also 
lead sports teams [12]. Table 7.25 shows only the payoffs to the Blue commander. 


Table 7.25 Payoffs to Blue for a noncooperative two-commander game with 
three strategies (adapted from [26, p. 4]). 

                                   RED’S STRATEGIES 
                               L (“LEFT”) | C (“CENTER”) | R (“RIGHT”) 
BLUE’S       T (“TOP”)       | Failure    | Success      | Success 
STRATEGIES   M (“MIDDLE”)    | Draw       | Success      | Draw 
             B (“BOTTOM”)    | Success    | Draw         | Failure 


7.2.6 Exercises on Mathematical Games 


7.1. Identify all the weakly dominant pure strategies in The Battle of the Sexes 
(table 7.16 in example 7.15). 


7.2. Identify all weakly dominant pure strategies in the Administration’s Solution 
to the Professors’ Problem (table 7.23 in example 7.21). 


7.3. Identify all Nash equilibria with pure strategies in The Battle of the Sexes 
(table 7.16 in example 7.15). 


7.4. Identify all Nash equilibria with mixed strategies in the Administration’s 
Solution to the Professors’ Problem (table 7.23 in example 7.21). 


7.5. Prove or disprove that every Nash equilibrium is a dominant strategy 
equilibrium. 


7.6. Prove or disprove that every two-player game restricted to pure strategies has 
a Nash equilibrium. 


7.7. For each function $f : A \times B \to C$ such that $C$ is linearly ordered and for each 
nonempty subset $E \subseteq A \times B$ the image $f''(E)$ has a first element $\min[f''(E)]$ and a 
last element $\max[f''(E)]$, prove that 

\[ \max\{\min\{f(x, y) : y \in B\} : x \in A\} \le \min\{\max\{f(x, y) : x \in A\} : y \in B\}. \] 


7.8. Design a function $f : A \times B \to C$ such that $C$ is linearly ordered and for each 
nonempty subset $E \subseteq A \times B$ the image $f''(E)$ has a first element $\min[f''(E)]$ and a 
last element $\max[f''(E)]$, with a strict inequality 

\[ \max\{\min\{f(x, y) : y \in B\} : x \in A\} < \min\{\max\{f(x, y) : x \in A\} : y \in B\}. \] 

Denote by “Al’s strategies” and “Bo’s strategies” the set of all strategies available 
to Al and Bo respectively. 


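7.9. Prove that if each player plays so as to avoid the worst payoff available, then 
they get the payoffs 

\[ \text{payoff to Al} = \max\{\min\{p^{\mathrm{Al}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} : S_{\mathrm{Bo}} \in \text{Bo’s strategies}\} : S_{\mathrm{Al}} \in \text{Al’s strategies}\}, \] 

\[ \text{payoff to Bo} = \max\{\min\{p^{\mathrm{Bo}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} : S_{\mathrm{Al}} \in \text{Al’s strategies}\} : S_{\mathrm{Bo}} \in \text{Bo’s strategies}\}. \] 


7.10. Assume that in a game for two players with two strategies 
$p^{\mathrm{Al}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} > p^{\mathrm{Al}}_{S'_{\mathrm{Al}},S'_{\mathrm{Bo}}}$ if and only if $p^{\mathrm{Bo}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} < p^{\mathrm{Bo}}_{S'_{\mathrm{Al}},S'_{\mathrm{Bo}}}$ 
for all positions $(S_{\mathrm{Al}}, S_{\mathrm{Bo}})$ and $(S'_{\mathrm{Al}}, S'_{\mathrm{Bo}})$. Prove that if 
each player plays so as to avoid the worst payoff available, then they get the payoffs 

\[ \text{payoff to Al} = \min\{\max\{p^{\mathrm{Al}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} : S_{\mathrm{Bo}} \in \text{Bo’s strategies}\} : S_{\mathrm{Al}} \in \text{Al’s strategies}\}, \] 

\[ \text{payoff to Bo} = \min\{\max\{p^{\mathrm{Bo}}_{S_{\mathrm{Al}},S_{\mathrm{Bo}}} : S_{\mathrm{Al}} \in \text{Al’s strategies}\} : S_{\mathrm{Bo}} \in \text{Bo’s strategies}\}. \] 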


7.11. For noncooperative games with any number of players and any number of 
pure strategies, prove or disprove that if at least one player has at least one weakly 
dominant strategy, then such a strategy contains a Nash equilibrium. 


7.12. For noncooperative games with two players but any number of pure strate- 
gies, prove or disprove that if at least one player has at least one weakly dominant 
strategy, then such a strategy contains a Nash equilibrium. 


7.13. Analyze the game of Blue against Red defined by table 7.25 in example 7.24. 


7.14. Prove or disprove that the formulae from exercise 7.10 still hold for more 
than two strategies. 


7.3 Match Making 


Match making pairs up items or individuals from two groups. For example, the 
National Resident Matching Program (NRMP) uses an algorithm developed by 
David Gale, Alvin E. Roth, and Lloyd S. Shapley to match medical doctors with 
internships in hospitals [68, 96]. For their work on such problems, Alvin E. Roth 
and Lloyd S. Shapley received the Nobel Prize in Economics in 2012. The precise 
statements and proofs of their algorithms involve the mathematical concepts of sets, 
functions, relations, and induction. 


7.3.1 Introduction 


The problems considered here have been documented for millennia since Plato’s 
time (Figure 7.2): how to arrange for proper marriages [97, p. 27], and how to admit 
students to schools [127, p. 44, footnote 11]. The problems consist in matching in 
some “optimal” way individuals from two groups, for example, boys and girls for 
marriage, students and schools for education, doctors and hospitals for internships, 
or, more generally, beggars and choosers (table 7.26). “Optimality” may mean, for 
instance, that no beggar from one couple and no chooser from another couple 
prefer each other to their current mates. A similar optimality applies to schools 
admitting several students. To get a sense of Gale and Shapley’s work, the reader 
may at this stage try to design such optimal match-making procedures. 

Fig. 7.2 Heracles and Athena, the goddess of wisdom, 480-470 B.C. (about the time 
of the Battles of Thermopylae and Salamis between Greek and Persian forces), by an 
olive tree presumably at the then future site of Plato’s Academy; photograph courtesy 
“User:Bibi Saint-Pol” via Wikipedia. 
(http://en.wikipedia.org/wiki/File:Athena_Herakles_Staatliche_Antikensammlungen_2648.jpg) 

Table 7.26 Applications of Gale and Shapley’s algorithms [68]. 

BEGGARS B  | CHOOSERS C | RELATION M 
Boy        | Girl       | Marriage 
Student    | School     | Admission 
Doctor     | Hospital   | Residency 
Recipient  | Donor      | Transplant 


7.3.2 A Mathematical Model for Optimal Match Making 


The first step in producing a successful match-making method consists in designing 
a mathematical model of the situation. With marriages, for instance, the group of 
boys may be modeled by a set B and the girls by a set C. To allow for applications 
more general than marriages and to shorten the prose, call the elements of B beggars 
and those of C choosers. Moreover, in the contexts considered here, the two sets B 
and C must be disjoint: $B \cap C = \emptyset$. The goal consists in marrying each boy to exactly 
one girl and each girl to exactly one boy. In general, the goal consists in establishing 
a bijection between the two sets B and C, or, yet more generally, a relation 
$M \subseteq B \times C$. Yet not every relation leads to successful marriages, because each chooser prefers 
some beggars over others while each beggar prefers some choosers over others. Thus 
a mathematical model of the situation must also include such preferences. 

To specify preferences for certain choosers over others, each beggar $b \in B$ ranks 
all choosers by a strict well-order $\succ_b$ on C, so that $c \succ_b c'$ means that b ranks c 
ahead of c'. Thus relative to $\succ_b$ each nonempty subset $W \subseteq C$ has a unique 
first-ranked element, denoted by $\mathrm{first}_b(W)$. 

To specify preferences for certain beggars over others, each chooser $c \in C$ ranks 
all beggars by a strict well-order $\succ_c$ on B. Thus relative to $\succ_c$ each nonempty subset 
$V \subseteq B$ has a unique first-ranked element, denoted by $\mathrm{first}_c(V)$. 

The two disjoint sets B and C, with B well-ordered by each chooser and C well- 
ordered by each beggar, complete the mathematical model of the situation. The 
second step consists in designing a model of a successful relation. 

For each relation $M \subseteq B \times C$, denote its domain by Domain(M) and its range by 
Range(M). Also, call “single” each beggar $b' \in B'_M := B \setminus \mathrm{Domain}(M)$ and each 
chooser $c' \in C'_M := C \setminus \mathrm{Range}(M)$: neither b' nor c' is related to anyone by M. 
Definition 7.27 specifies the notion of a successful relation by a concept of stability. 


7.27 Definition. A relation $M \subseteq B \times C$ is unstable if and only if at least one of the 
following conditions holds: 

(US.1) A beggar and a chooser prefer each other to their current mates: there 
are different couples $(b_1, c_1), (b_2, c_2) \in M$ for whom $b_2 \succ_{c_1} b_1$ and $c_1 \succ_{b_2} c_2$, a 
condition denoted by $(b_1, c_1) \bowtie (b_2, c_2)$. 

(US.2) There is a couple $(b, c) \in M$ and a “single” beggar $b' \in B'_M = B \setminus \mathrm{Domain}(M)$ 
for whom $b' \succ_c b$ and $c \succ_{b'} c'$ for every “single” chooser 
$c' \in C'_M = C \setminus \mathrm{Range}(M)$, a condition denoted by $(b, c) \bowtie b'$. 

(US.3) There is a couple $(b, c) \in M$ and a “single” chooser $c' \in C'_M = C \setminus \mathrm{Range}(M)$ 
for whom $c' \succ_b c$ and $b \succ_{c'} b'$ for every “single” beggar 
$b' \in B'_M = B \setminus \mathrm{Domain}(M)$, a condition denoted by $c' \bowtie (b, c)$. 

A relation $M \subseteq B \times C$ is stable if and only if it is not unstable, so that none of the 
preceding three conditions holds. 


7.28 Definition. A stable relation $M \subseteq B \times C$ is total if and only if Domain(M) = B 
and Range(M) = C. 


7.29 Example. The empty relation $\emptyset \subseteq B \times C$ is vacuously stable. It is total if and 
only if $B = \emptyset = C$. 


Specifications of the situation and goal do not yet suffice to reach the goal. To this 
end, the next step in mathematical modeling consists in developing an algorithm, 
method, or procedure to reach the goal from the current situation. One algorithm 
uses a match maker; another algorithm is carried out by the participants without a 
match maker. 


7.3.3 An Algorithm for Optimal Match Making 
with a Match Maker 


This subsection defines and demonstrates an algorithm for a match maker (person or 
machine) to find a stable relation. First, a match maker with access to all well-orders 
from all beggars and choosers can extend any stable relation. 

7.30 Theorem. For each stable relation $M \subseteq B \times C$ with $\mathrm{Domain}(M) \subsetneq B$ and 
$\mathrm{Range}(M) \subsetneq C$ there exists a stable proper extension $\widetilde{M}$ with $M \subsetneq \widetilde{M} \subseteq B \times C$. 


Proof. By the hypothesis that $\mathrm{Domain}(M) \subsetneq B$, there exists a “single” beggar 
$b' \in B'_M = B \setminus \mathrm{Domain}(M) \neq \emptyset$. Since the well-order $\succ_{b'}$ restricts to a well-order on 
$C'_M = C \setminus \mathrm{Range}(M) \neq \emptyset$, the “single” beggar b' has a first-ranked single chooser 
$c' = \mathrm{first}_{b'}(C'_M)$. In particular, the set $C_1$ of those single choosers in $C'_M$ that are 
ranked first among $C'_M$ by some single beggar in $B'_M$ is not empty: 

\[ C_1 := \{c_1 \in C'_M : \exists b' [(b' \in B'_M) \wedge (c_1 = \mathrm{first}_{b'}(C'_M))]\} \neq \emptyset. \] 

Also, for each first-ranked single chooser $c_1 \in C_1$, the set $B_{c_1} \subseteq B'_M$ of single 
beggars who rank $c_1$ first in $C'_M$, 

\[ B_{c_1} := \{b \in B'_M : c_1 = \mathrm{first}_b(C'_M)\}, \] 

is not empty by definition and hence has a unique first element $b_{c_1} := \mathrm{first}_{c_1}(B_{c_1})$. 
Moreover, if $c_1 \neq c_2 \in C_1$, then $b_{c_1} \neq b_{c_2}$, because $b_{c_1}$ ranks $c_1$ first, ahead of $c_2$. 
Thus the function $C_1 \to B'_M$, $c_1 \mapsto b_{c_1}$ is injective. Define 

\[ M_1 := \{(b_{c_1}, c_1) \in B'_M \times C'_M : (c_1 \in C_1) \wedge [b_{c_1} = \mathrm{first}_{c_1}(B_{c_1})]\}, \] 
\[ \widetilde{M} := M \cup M_1 \supsetneq M. \] 

First, the relation $M_1$ is stable in $B'_M \times C'_M$: 
if $(b_{c_1}, c_1), (b_{c_2}, c_2) \in M_1$, then $c_1 = \mathrm{first}_{b_{c_1}}(C'_M)$, whence $c_1 \succeq_{b_{c_1}} c'$ for every 
$c' \in C'_M$, so that $(b_{c_2}, c_2) \bowtie (b_{c_1}, c_1)$ and $c' \bowtie (b_{c_1}, c_1)$ do not occur. Moreover, 
$b_{c_1} = \mathrm{first}_{c_1}(B_{c_1})$, so that if $b' \in B'_M$ and $b' \succ_{c_1} b_{c_1}$, then $b' \notin B_{c_1}$, which means that 
there exists $c' \in C'_M$ with $c' \succ_{b'} c_1$, so that $(b_{c_1}, c_1) \bowtie b'$ does not occur either. 

Second, $\widetilde{M}$ is stable in $B \times C$: if $(b, c) \in M$ and $(b_{c_1}, c_1) \in M_1$, then either 
$b \succ_c b_{c_1}$, in which case $(b, c) \bowtie (b_{c_1}, c_1)$ fails, or $b_{c_1} \succ_c b$ but then by the assumed 
stability of M in $B \times C$ there exists a “single” chooser $c' \in C'_M$ for whom $c' \succ_{b_{c_1}} c$, yet 
$c_1 = \mathrm{first}_{b_{c_1}}(C'_M)$, so that either $c_1 = c'$ or $c_1 \succ_{b_{c_1}} c' \succ_{b_{c_1}} c$. In either case 
$(b, c) \bowtie (b_{c_1}, c_1)$ fails. 

Similarly, if $c \succ_b c_1$, then $(b_{c_1}, c_1) \bowtie (b, c)$ fails; also, if $c_1 \succ_b c$, then by the 
assumed stability of M there exists some “single” beggar $b' \in B'_M$ such that $b' \succ_{c_1} b$, 
whence either $b' = b_{c_1}$ or $b_{c_1} \succ_{c_1} b' \succ_{c_1} b$, and then $(b_{c_1}, c_1) \bowtie (b, c)$ fails. □ 


Second, if the sets B of beggars and C of choosers have the same finite number of 
elements, then the algorithm that starts from the stable empty relation and iterates 
theorem 7.30 yields a total stable relation after finitely many iterations. 


7.31 Algorithm (Match Maker’s Algorithm). 

Initially, for all disjoint sets B and C, set $M := \emptyset \subseteq B \times C$. 

While $\mathrm{Domain}(M) \neq B$ and $\mathrm{Range}(M) \neq C$, find a proper extension $\widetilde{M}$ by 
Theorem 7.30, and re-set $M := \widetilde{M}$. 

Table 7.33 Gale and Shapley’s unlabeled example, adapted from [41, p. 389]. 

BEGGARS B |          CHOOSERS C 
          | c1   | c2   | c3   | c4 
b1        | 1,3  | 2,2  | 3,1  | 4,3 
b2        | 1,4  | 2,3  | 3,3  | 4,4 
b3        | 3,1  | 1,4  | 2,3  | 4,2 
b4        | 2,2  | 3,1  | 1,4  | 4,1 


7.32 Example. Consider Gale and Shapley’s unlabeled example [40, 41, p. 389], 
adapted in table 7.33. At the intersection of the row for $b_i$ and the column for $c_j$, 
the ordered pair $(m_i, n_j)$ states that $b_i$ ranks $c_j$ in position $m_i$, whereas $c_j$ ranks $b_i$ in 
position $n_j$. For instance, at the intersection of the row for $b_3$ and the column for $c_2$, 
the ordered pair (1, 4) states that $b_3$ ranks $c_2$ in position 1, whereas $c_2$ ranks $b_3$ in 
position 4. 

Start from the stable empty relation $\emptyset \subseteq B \times C$. Thus $B'_M = B$ and $C'_M = C$. The 
set of first-ranked choosers is $C_1 = \{c_1, c_2, c_3\}$. They are ranked first by 

\[ B_{c_1} = \{b_1, b_2\}, \quad B_{c_2} = \{b_3\}, \quad B_{c_3} = \{b_4\}. \] 

Among those beggars who ranked them first, their first-ranked beggars are 

\[ b_{c_1} = \mathrm{first}_{c_1}(\{b_1, b_2\}) = b_1, \quad b_{c_2} = b_3, \quad b_{c_3} = b_4. \] 

Theorem 7.30 extends $M := \emptyset$ to 

\[ \widetilde{M} = \{(b_1, c_1), (b_3, c_2), (b_4, c_3)\}, \] 

where all beggars have their first choice. Theorem 7.30 then extends $\widetilde{M}$ to 

\[ \widetilde{\widetilde{M}} = \{(b_1, c_1), (b_3, c_2), (b_4, c_3)\} \cup \{(b_2, c_4)\}, \] 

where $b_2$ and $c_4$ have their worst choice, but then $c_4$ was ranked last by every beggar. 
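
As an illustration only, the iteration of theorem 7.30 in algorithm 7.31 might be sketched in Python as follows. The function and variable names are hypothetical. The preference lists transcribe table 7.33, except that the column for c3 as printed lists position 3 twice; the order assumed here for c3 is a guess, which does not affect this run because c3 never has to choose between two beggars.

```python
# Preference lists, first-ranked first, transcribed from table 7.33.
BEGGARS = {
    "b1": ["c1", "c2", "c3", "c4"],
    "b2": ["c1", "c2", "c3", "c4"],
    "b3": ["c2", "c3", "c1", "c4"],
    "b4": ["c3", "c1", "c2", "c4"],
}
CHOOSERS = {
    "c1": ["b3", "b4", "b1", "b2"],
    "c2": ["b4", "b1", "b2", "b3"],
    "c3": ["b1", "b2", "b3", "b4"],  # assumed order: the printed column is ambiguous
    "c4": ["b4", "b3", "b1", "b2"],
}


def extension(M, beggar_pref, chooser_pref):
    """Couples added to the stable relation M by one use of theorem 7.30."""
    matched_b = {b for b, _ in M}
    matched_c = {c for _, c in M}
    single_b = [b for b in beggar_pref if b not in matched_b]
    single_c = {c for c in chooser_pref if c not in matched_c}
    # Group the single beggars by their first-ranked single chooser.
    suitors = {}
    for b in single_b:
        first = next(c for c in beggar_pref[b] if c in single_c)
        suitors.setdefault(first, []).append(b)
    # Each such chooser is paired with her first-ranked beggar in her group.
    return {(min(group, key=chooser_pref[c].index), c)
            for c, group in suitors.items()}


def match_maker(beggar_pref, chooser_pref):
    """Iterate the extension from the empty relation; also return each step."""
    M, steps = set(), []
    while len(M) < len(beggar_pref) and len(M) < len(chooser_pref):
        added = extension(M, beggar_pref, chooser_pref)
        steps.append(added)
        M |= added
    return M, steps


matching, steps = match_maker(BEGGARS, CHOOSERS)
for number, added in enumerate(steps, start=1):
    print("extension", number, "adds", sorted(added))
```

On this data the first extension adds the three couples found in example 7.32, and the second extension adds the couple of b2 and c4.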


7.3.4 An Algorithm for Optimal Match Making 
Without a Match Maker 


This subsection describes Gale & Shapley’s algorithm to find a stable relation 
without any match maker [40, 41]. In the context of marriages, boys and girls carry 
out the algorithm themselves through rounds of proposals and rejections. 

7.34 Algorithm (Gale & Shapley’s Deferred Acceptance Procedure). 
Initially, for each girl $c \in C$, the set $B_c$ of boys who proposed to her is empty. 
Similarly, for each boy $b \in B$, the set $C_b$ of girls who rejected him is empty. 
Then each round of proposals and rejections proceeds as follows: 

(BG.1) Each girl $c \in C$ has a set $B_c$, which is either empty, or is a singleton 
containing only (the name of) her top-ranked boy among those boys who have 
already proposed to her. 

(BG.2) Each boy $b \in B$ has a set $C_b$, which is either empty, or contains all the 
girls who have already rejected him. 

(BG.3) Each boy $b \in B$ proposes to his top-ranked girl $c_b := \mathrm{first}_b(C \setminus C_b)$ among 
those girls who have not yet rejected him. 

(BG.4) From the boys’ proposals, each girl $c \in C$ receives a set $B'_c$ of proposals, 
which may be empty; she forms the union $B''_c := B'_c \cup B_c$, and rejects all but her 
top-ranked boy in that set, in effect re-setting it to $B_c := \{\mathrm{first}_c(B''_c)\}$. 

(BG.5) Each boy $b \in B$ who receives a rejection from a girl $c \in C$ appends her 
to his set of rejections, in effect replacing $C_b$ by $C_b \cup \{c\}$. 


If the sets B of beggars and C of choosers have the same finite number of 
elements, then Gale & Shapley’s algorithm yields a stable relation after finitely 
many iterations, as proved in exercises 7.15 and 7.16. 
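
The rounds (BG.1) to (BG.5) might be sketched in Python as follows. This is a sketch under stated assumptions, not a definitive implementation: the function name, the rule of stopping at the first round without any rejection, and the three boys and three girls with their preference lists are all hypothetical and chosen only for illustration.

```python
def deferred_acceptance(boy_pref, girl_pref):
    """Rounds of proposals and rejections; preference lists are first-ranked first.

    Returns the set of couples and the number of rounds, stopping at the
    first round without any rejection (an assumed stopping rule).
    """
    held = {g: None for g in girl_pref}        # B_c: the boy kept by each girl
    rejected = {b: set() for b in boy_pref}    # C_b: girls who rejected each boy
    rounds = 0
    while True:
        rounds += 1
        # (BG.3) Each boy proposes to his top-ranked girl who has not rejected him.
        proposals = {g: set() for g in girl_pref}
        for b, prefs in boy_pref.items():
            choice = next((g for g in prefs if g not in rejected[b]), None)
            if choice is not None:
                proposals[choice].add(b)
        # (BG.4) Each girl keeps only her top-ranked boy among new and kept proposers.
        any_rejection = False
        for g, new in proposals.items():
            pool = new | ({held[g]} if held[g] is not None else set())
            if not pool:
                continue
            best = min(pool, key=girl_pref[g].index)
            held[g] = best
            # (BG.5) Each rejected boy records the rejection.
            for b in pool - {best}:
                rejected[b].add(g)
                any_rejection = True
        if not any_rejection:
            return {(b, g) for g, b in held.items() if b is not None}, rounds


# Hypothetical preferences, not taken from the text.
BOYS = {"b1": ["c1", "c2", "c3"], "b2": ["c1", "c3", "c2"], "b3": ["c2", "c1", "c3"]}
GIRLS = {"c1": ["b2", "b1", "b3"], "c2": ["b1", "b3", "b2"], "c3": ["b1", "b2", "b3"]}

couples, rounds = deferred_acceptance(BOYS, GIRLS)
print(sorted(couples), "after", rounds, "rounds")
```

In this hypothetical instance b1 is rejected by c1 in the first round, b3 is rejected by c2 in the second round and by c1 in the third, and the fourth round produces no rejection.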


7.3.5 Exercises on Gale & Shapley’s Algorithms 


7.15. For disjoint sets B and C with the same finite number N of elements, prove 
that if in some round any boy $b \in B$ receives his (N − 1)-th rejection, then Gale & 
Shapley’s algorithm terminates at the next round. 


7.16. For disjoint sets B and C with the same finite number N of elements, prove 
that Gale & Shapley’s algorithm terminates after at most $N^2 + 2 - 2N$ rounds. 


7.17. Prove or disprove that for all disjoint sets B and C with the same infinite 
cardinality there must exist a total stable relation. 


7.18. Carry out Gale & Shapley’s algorithm with Gale & Shapley’s example 7.32. 


7.19. Extend algorithm 7.31 (with a match maker) to stable relations that may 
relate each chooser to more than one beggar, with a quota $q_c$ of beggars for chooser 
c (polygamy, polyandry, schools admitting more than one student, etc.). 


7.20. Extend algorithm 7.34 (without a match maker) to stable relations that may 
relate each chooser to more than one beggar, with a quota $q_c$ of beggars for chooser 
c (polygamy, polyandry, schools admitting more than one student, etc.). 


7.21. Denote by $\mathcal{S}$ the subset of the power set $\mathcal{P}(B \times C)$ consisting of all stable 
relations, partially ordered by inclusion. Prove that for each chain $\mathcal{T} \subseteq \mathcal{S}$, its union 
$\bigcup \mathcal{T} := \bigcup_{Z \in \mathcal{T}} Z$ is an upper bound for $\mathcal{T}$ in $\mathcal{S}$. 


7.22. Determine the maximum number of iterations of theorem 7.30 (match 
making with a match maker) necessary to complete a total stable relation between 
disjoint sets B and C with the same finite number N of elements. 


7.23. Design and test a program to iterate theorem 7.30 (match making with a 
match maker) for disjoint sets B and C with the same finite number N of elements. 


7.24. Design and test a program to iterate theorem 7.30 (match making with a 
match maker) for disjoint finite sets B and C with quotas. 


7.3.6 Projects 


7.35 Project. Develop concepts, theorems, and algorithms for match making where 
beggars might order any subset, not necessarily the whole set, of choosers, and 
choosers might order any subset, not necessarily the whole set, of beggars. For 
example, the National Resident Matching Program (NRMP) evidently uses an 
algorithm to this effect [96]. 


7.36 Project. Develop concepts, theorems, and algorithms for tri-partite match 
making. (Tri-partite reproduction occurs in Isaac Asimov’s novel The Gods Them- 
selves [5]. See also the “three-parent” therapy [2].) 


7.4 Arrow’s Impossibility Theorem 


Arrow’s Impossibility Theorem exposes some of the limitations inherent to voting: 
several desired features of voting procedures are mutually incompatible. Jointly with 
John R. Hicks, Kenneth J. Arrow received the Nobel Prize in Economics in 1972, 
in part for his Impossibility Theorem. Its precise statement and proof involve the 
mathematical concepts of sets, functions, and ranking relations. 


7.4.1 Introduction 


The problem considered here consists in designing a procedure to rank several 
mutually exclusive alternatives, or merely to choose exactly one such alternative. By 
law or otherwise, the voting procedure may have to take into account many decision 
criteria from the voters, and may have to conform to rules imposed in advance by 
the voters. 

In the political arena, the decision criteria may be ballots submitted by voters, 
while the alternatives may be persons who are candidates for public office. Different 
voting procedures may lead to the election of different candidates. 

Fig. 7.3 The Mars Climate Orbiter is conjectured to have followed too low a 
trajectory and burned up in the atmosphere of Mars; art work courtesy Jet Propulsion 
Laboratory (JPL) and NASA. 
(http://www.jpl.nasa.gov/jplhistory/the90/images/climate-orbiter-browse.jpg) 


7.37 Example. In the United States presidential election of 1824, the popular and 
electoral votes had ranked Andrew Jackson (Democrat-Republican Party) ahead of 
John Quincy Adams (Coalition Party), William H. Crawford, and Henry Clay (Whig 
Party); nevertheless the House of Representatives elected John Quincy Adams ahead 
of Andrew Jackson 
(http://www.archives.gov/federal-register/electoral-college/scores.html). 


In the scientific arena, the decision criteria may be measurements from sensors, 
which may act as voters, while the alternatives, which play the rôles of candidates, 
may be decisions on how to proceed with a mission. In some missions, a simple 
majority of votes may fail to select the best alternative (Figure 7.3). 


7.38 Example. The Mars Climate Orbiter spacecraft was lost on 23 September 
1999, due to small thrusters controlled by faulty unit conversions that compounded 
during the year-long flight, which had been tracked by three telemetric systems: 
Doppler only, range only, and Doppler and range combined. Two out of the three 
systems submitted the same vote and won. Range only, and Doppler and range 
combined, both showed a flight path clearing the planet, allowing the mission to 
proceed without corrections. Both systems were wrong and the probe crashed. 
The minority vote was right: “The Doppler-only solutions consistently indicated 
a flight path insertion closer to the planet. These discrepancies were not resolved” 
[78, p. 13]. In some situations, discrepancies must be investigated, not voted away. 


Among other procedures, the voting procedures considered here are based on 
voters ranking all the candidates. 


7.39 Example. Table 7.40 shows Judges’ (voters’) ranking of three skaters (candi- 
dates) at the 1994 Winter Olympic Games in Lillehammer, Norway [112, p. 22]. 
The judges’ plurality voting procedure selects the skater rated first by the largest 
number of judges, here Baiul ahead of Kerrigan. 

Table 7.40 Judges’ rankings of skaters [112, p. 22]. 

                              VOTERS 
SKATERS   | D | PL | CZ | UA | PRC | USA | J | CDN | UK 
Baiul     | 1 | 1  | 1  | 1  | 1   | 2   | 2 | 3   | 3 
Kerrigan  | 2 | 2  | 2  | 2  | 2   | 1   | 1 | 1   | 1 
Chen Lu   | 3 | 3  | 3  | 3  | 3   | 3   | 3 | 2   | 2 


Table 7.42 Voters’ rankings of three candidates [113, p. 448]. 

                 VOTERS’ RANKINGS OF CANDIDATES 
RANKS   | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | x 
Top     | Al | Al | Al | Al | Al | Bo | Bo | Ci | Ci | Ci | Ci 
Second  | Bo | Bo | Bo | Ci | Ci | Ci | Ci | Bo | Bo | Bo | Bo 
Last    | Ci | Ci | Ci | Bo | Bo | Al | Al | Al | Al | Al | Al 


Table 7.43 Voters’ rankings of the two remaining candidates. 

                 VOTERS’ RANKINGS OF CANDIDATES 
RANKS   | 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | x 
Top     | Al | Al | Al | Al | Al | Bo | Bo | Bo | Bo | Bo | Bo 
Second  | Bo | Bo | Bo | Bo | Bo | Al | Al | Al | Al | Al | Al 


Example 7.41 exposes some peculiarities of plurality voting. 


7.41 Example. Table 7.42 shows hypothetical rankings of three candidates by 
eleven voters, adapted from [113, p. 448]. 

The method of election called Borda’s count attempts to take into account 
voters’ rankings by allocating a candidate two points for each top choice and one 
point for each second choice from the voters. Table 7.42 shows that Ci leads 
with (4 × 2) + (4 × 1) = 12 points, followed by Bo with (2 × 2) + (7 × 1) = 11 points, 
and Al with 5 × 2 = 10 points. The press might list the outcome as Ci > Bo > Al. 
Thus Ci is elected. 

The method of election called plurality voting ignores voters’ second and 
subsequent choices, takes into account only each voter’s top choice, and elects the 
candidate who is the top choice of most voters. Table 7.42 shows Al leading with 5 
top choices, followed by Ci with 4, and Bo with 2. Thus Al > Ci > Bo, and Al is 
elected. 

Suppose now that Ci drops out (as did Ross Perot in 1992). Table 7.43 shows 
the voters’ rankings of Al and Bo from table 7.42. With plurality voting, Bo wins 
with 6 votes and Al loses with 5 votes: Bo > Al, and Bo is elected. 

Such a reversal of the election result from Al to Bo, caused by a change in 
the ranking of a seemingly irrelevant third candidate, Ci, is one of the features of 
plurality voting that is deemed undesirable by some voters. 
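
The three tallies in this example can be reproduced by a short computation. The following Python sketch is only an illustration: the ballots transcribe the columns of table 7.42, while the function names are hypothetical.

```python
from collections import Counter

# One ballot per voter, top choice first, transcribed from table 7.42.
BALLOTS = (
    [("Al", "Bo", "Ci")] * 3      # voters 0, 1, 2
    + [("Al", "Ci", "Bo")] * 2    # voters 3, 4
    + [("Bo", "Ci", "Al")] * 2    # voters 5, 6
    + [("Ci", "Bo", "Al")] * 4    # voters 7, 8, 9, x
)


def plurality(ballots):
    """Number of top choices received by each candidate."""
    return Counter(ballot[0] for ballot in ballots)


def borda(ballots):
    """Borda's count: with three candidates, 2 points per top choice,
    1 point per second choice, 0 points per last choice."""
    scores = Counter()
    for ballot in ballots:
        for position, candidate in enumerate(ballot):
            scores[candidate] += len(ballot) - 1 - position
    return scores


def drop(ballots, candidate):
    """Ballots after `candidate` withdraws; other relative rankings are kept."""
    return [tuple(c for c in ballot if c != candidate) for ballot in ballots]


print("Borda:", dict(borda(BALLOTS)))
print("Plurality:", dict(plurality(BALLOTS)))
print("Plurality without Ci:", dict(plurality(drop(BALLOTS, "Ci"))))
```

The first two lines of output give the scores 12, 11, 10 and the tallies 5, 4, 2 computed above; the third gives 6 votes for Bo against 5 for Al, which is the reversal.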
7.4.2 A Mathematical Model for Arrow’s Impossibility 
Theorem 


Many features have been deemed desirable from a voting procedure, for instance, 
the following features. 


(AIT.1) Unrestricted Domain. Each voter may submit any ranking of the candi- 
dates: no rankings are forbidden. 

(AIT.2) Unanimity. If every voter prefers candidate Al to Bo, then the voting 
procedure must rank Al ahead of Bo. 

(AIT.3) Independence of Irrelevant Alternatives. If every voter ranks candidates 
Al and Bo in the same order in two different elections, then the voting procedure 
must rank Al and Bo in the same order in both elections, independently (regardless) 
of changes in the ranking of any candidate Ci other than Al and Bo between the two 
elections. 

(AIT.4) Absence of Dictators. The ranking from the voting procedure does not 
coincide with the ranking of any fixed voter: no voters can dictate the outcome 
of all elections. 

(AIT.5) Anonymity. The ranking from the voting procedure does not depend 
on who cast what vote. (Anonymity may be desirable for public votes, but 
undesirable for the votes of elected representatives or scientific sensors.) 


To get a sense of Arrow’s work, the reader may at this stage try to design a voting 
procedure with all the features just listed to elect one among three candidates: either 
find such a voting procedure, or prove that there are none. 

Arrow’s Impossibility Theorem states that the first four features, (AIT.1), 
(AIT.2), (AIT.3), and (AIT.4) are already mutually incompatible: no such voting 
procedures are possible. As just stated, the four features are somewhat vague. For 
instance, a voting procedure might need only to accommodate finitely many voters. 
A precision sufficient for rigorous reasoning may have to be inserted and culminate 
in a mathematical model of the descriptions of the features. To get a deeper sense 
of Arrow’s work, the reader may at this stage try to formulate those four features 
mathematically. 

In general, a set $\mathcal{V}$ of voters must rank a set $\mathcal{C}$ of mutually exclusive alternatives, 
for instance, candidates, decisions, laws, policies, etc. To this end, each voter 
submits one ranking of the set $\mathcal{C}$ of candidates. From the rankings submitted by 
the voters a “social welfare function” $\mathcal{F}$ produces a final “aggregate” ranking of the 
set $\mathcal{C}$ of candidates (Table 7.44). 


7.45 Definition. A weak ranking allowing for ties on a set $\mathcal{C}$ is a transitive and 
strongly connected relation $R \subseteq \mathcal{C} \times \mathcal{C}$. 

To allow for ties, weak rankings need not be anti-symmetric; thus weak rankings 
are not partial orders. 

The notation $(X, Y) \in R$, also abbreviated by $XRY$, means that R ranks X before 
or tied with Y. 

The transitivity of R is defined by 

\[ \forall X \forall Y \forall Z \{[(XRY) \wedge (YRZ)] \Rightarrow (XRZ)\}. \] 

Table 7.44 Symbols from logic. 

SYMBOL                | DEFINITION 
$\neg(P)$             | “not P”: True if and only if P is False. 
$(P) \wedge (Q)$      | “P and Q”: True if and only if P and Q are both True. 
$(P) \vee (Q)$        | “P or Q”: False if and only if P and Q are both False. 
$(P) \Rightarrow (Q)$ | “P implies Q”: False if and only if P is True and Q is False. 
$\forall$             | “for each”; 
$\exists$             | “there exists”. 

The strong connectedness of R is defined by 

\[ \forall X \forall Y \{[(X \in \mathcal{C}) \wedge (Y \in \mathcal{C})] \Rightarrow [(XRY) \vee (YRX)]\}. \] 

In particular, for all elements X and Y in $\mathcal{C}$, if X = Y, then $XRX$, so that R is 
reflexive. 

The inverse ranking is the inverse relation 
$R^{\circ -1} := \{(Y, X) \in \mathcal{C} \times \mathcal{C} : (X, Y) \in R\}$. 

The preference $P_R$ associated with the ranking R is the set-theoretic difference 
$P_R := R \setminus R^{\circ -1}$. 

A weak ranking R may also be denoted by such a symbol as $\succeq$ so that $X \succeq Y$ 
means $XRY$. 

The associated preference $P_R$ may then be denoted by $\succ$ so that $X \succ Y$ means 
$(X \succeq Y) \wedge [\neg(Y \succeq X)]$. 

The set of all weak rankings of a set $\mathcal{C}$ is denoted by $\mathcal{R}(\mathcal{C})$. 

A voters’ profile is a function $r : \mathcal{V} \to \mathcal{R}(\mathcal{C})$ with domain $\mathcal{V}$. 

To each voter $V \in \mathcal{V}$, a voters’ profile r associates that voter’s ranking, 
$r(V) \in \mathcal{R}(\mathcal{C})$, also denoted by $\succeq_V$. 

The voter’s preference associated with that voter’s ranking $r(V)$ is then also 
denoted by $P_{r(V)}$ or by $\succ_V$. 

The set of all voters’ profiles, which are all functions from $\mathcal{V}$ to $\mathcal{R}(\mathcal{C})$ with 
domain $\mathcal{V}$, is denoted by $\mathcal{R}(\mathcal{C})^{\mathcal{V}}$. 

A social welfare function is a function $\mathcal{F} : \mathcal{R}(\mathcal{C})^{\mathcal{V}} \to \mathcal{R}(\mathcal{C})$ with domain 
$\mathcal{R}(\mathcal{C})^{\mathcal{V}}$. 

To each voters’ profile r, a social welfare function $\mathcal{F}$ associates an “aggregate” 
ranking $\mathcal{F}(r) \in \mathcal{R}(\mathcal{C})$. 

7.46 Definition. A ranking $R \subseteq \mathcal{C} \times \mathcal{C}$ induces a ranking 
$R_{X,Y} := R \cap (\{X, Y\} \times \{X, Y\})$ of two candidates $X, Y \in \mathcal{C}$ relative to each other. 
Two rankings $R, S \subseteq \mathcal{C} \times \mathcal{C}$ rank two candidates $X, Y \in \mathcal{C}$ in the same order 
if and only if $R_{X,Y} = S_{X,Y}$. 
Two voters rank two candidates in the same order if and only if so do their rankings. 


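The problem considered here consists in investigating the feasibility and design 
of a social welfare function. 

Table 7.47 Symbols for Arrow’s Impossibility Theorem. 

SYMBOL | DEFINITION 
$\mathcal{C}$ | Set of all candidates. 
$\mathcal{R}(\mathcal{C})$ | Set of all rankings on $\mathcal{C}$. 
$\mathcal{V}$ | Set of all voters. 
$r$ | Voters’ profile: a function $r : \mathcal{V} \to \mathcal{R}(\mathcal{C})$ with domain $\mathcal{V}$. 
$\mathcal{R}(\mathcal{C})^{\mathcal{V}}$ | Set of all voters’ profiles: all functions $\mathcal{V} \to \mathcal{R}(\mathcal{C})$ with domain $\mathcal{V}$. 
$\mathcal{F}$ | Social welfare function: a function $\mathcal{F} : \mathcal{R}(\mathcal{C})^{\mathcal{V}} \to \mathcal{R}(\mathcal{C})$. 
$\mathcal{R}(\mathcal{C})^{[\mathcal{R}(\mathcal{C})^{\mathcal{V}}]}$ | Set of all social welfare functions: all functions $\mathcal{R}(\mathcal{C})^{\mathcal{V}} \to \mathcal{R}(\mathcal{C})$. 
$X \succ_V Y$ | Alternative notation for $(X, Y) \in P_{r(V)}$: voter V prefers X to Y. 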


7.4.3 Statement and Proof of Arrow’s Impossibility Theorem 


Not all social welfare functions are considered here: only those that satisfy the 
following four conditions. 


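(AIT.1) Unrestricted Domain. The domain of each social welfare function $\mathcal{F}$ is 
all of $\mathcal{R}(\mathcal{C})^{\mathcal{V}}$. This means that each voter may submit any ranking of the 
candidates for the social welfare function to produce an aggregate ranking 
(Table 7.47). 

(AIT.2) Unanimity. For all candidates X and Y in $\mathcal{C}$, if in a voters’ profile r each 
voter V in $\mathcal{V}$ prefers X to Y, then so does the aggregate ranking from the social 
welfare function $\mathcal{F}$: 

\[ \forall X \forall Y \forall r \left( \{[r \in \mathcal{R}(\mathcal{C})^{\mathcal{V}}] \wedge \forall V [(X, Y) \in P_{r(V)}]\} \Rightarrow [(X, Y) \in P_{\mathcal{F}(r)}] \right). \] 

(AIT.3) Independence of Irrelevant Alternatives. If every voter $V \in \mathcal{V}$ ranks 
candidates X and Y in the same order in two voting profiles r and s, then the 
aggregate rankings $\mathcal{F}(r)$ and $\mathcal{F}(s)$ also rank X and Y in the same order: 

\[ \forall X \forall Y \forall r \forall s \left\{ \left( \forall V \{[r, s \in \mathcal{R}(\mathcal{C})^{\mathcal{V}}] \wedge [(X, Y) \in r(V) \cap s(V)]\} \right) \Rightarrow [(X, Y) \in \mathcal{F}(r) \cap \mathcal{F}(s)] \right\}. \] 

(AIT.4) Absence of Dictators. The social welfare function does not coincide with 
the ranking of any voter: 

\[ \neg \left\{ \exists V \left[ \forall r \left( [r \in \mathcal{R}(\mathcal{C})^{\mathcal{V}}] \Rightarrow [\mathcal{F}(r) = r(V)] \right) \right] \right\}. \] 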
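
To make the conditions concrete, the following Python sketch tests unanimity, independence of irrelevant alternatives, and the presence of a dictator by exhaustive search. It is a sketch under simplifying assumptions that are not part of the text: only two voters, three candidates, and strict rankings without ties are considered, aggregate rankings are represented by scores (a higher score means an earlier rank), and all function names are hypothetical. Such a search illustrates the conditions on a small case; it does not prove the theorem.

```python
from itertools import permutations, product

CANDIDATES = ("Al", "Bo", "Ci")
RANKINGS = list(permutations(CANDIDATES))        # strict rankings, best first
PROFILES = list(product(RANKINGS, repeat=2))     # one ranking for each of two voters
PAIRS = [(x, y) for x in CANDIDATES for y in CANDIDATES if x != y]


def sign(n):
    return (n > 0) - (n < 0)


def voter_order(ranking, x, y):
    """+1 if the voter's ranking puts x ahead of y, -1 otherwise."""
    return sign(ranking.index(y) - ranking.index(x))


def social_order(scores, x, y):
    """+1, 0, or -1 as the aggregate ranks x ahead of, tied with, or behind y."""
    return sign(scores[x] - scores[y])


def dictator(profile):
    """Aggregate ranking that copies the ranking of voter 0."""
    return {c: -profile[0].index(c) for c in CANDIDATES}


def borda(profile):
    """Borda's count: 2 points per top choice, 1 point per second choice."""
    return {c: sum(len(CANDIDATES) - 1 - r.index(c) for r in profile)
            for c in CANDIDATES}


def unanimity(f):
    return all(social_order(f(p), x, y) == 1
               for p in PROFILES for x, y in PAIRS
               if all(voter_order(r, x, y) == 1 for r in p))


def independence(f):
    return all(social_order(f(p), x, y) == social_order(f(q), x, y)
               for p in PROFILES for q in PROFILES for x, y in PAIRS
               if all(voter_order(r, x, y) == voter_order(s, x, y)
                      for r, s in zip(p, q)))


def has_dictator(f):
    return any(all(social_order(f(p), x, y) == voter_order(p[k], x, y)
                   for p in PROFILES for x, y in PAIRS)
               for k in range(2))


for name, f in (("dictator", dictator), ("borda", borda)):
    print(name, unanimity(f), independence(f), has_dictator(f))
```

In this small case, copying the ranking of one voter satisfies unanimity and independence but is a dictatorship, whereas Borda's count satisfies unanimity and has no dictator but violates independence.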


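7.48 Definition. A permutation of a set $\mathcal{V}$ is a bijection $\sigma : \mathcal{V} \to \mathcal{V}$ with domain 
$\mathcal{V}$ and range $\mathcal{V}$. 

The set of all permutations of a set $\mathcal{V}$ is denoted by $\mathcal{S}_{\mathcal{V}}$ and called the symmetric 
group of $\mathcal{V}$. 

For all elements Bo, Ci $\in \mathcal{C}$, the transposition $\tau_{\mathrm{Bo,Ci}}$ is the permutation 
$\tau_{\mathrm{Bo,Ci}} \in \mathcal{S}_{\mathcal{C}}$ that swaps Bo and Ci but fixes every other element Al $\in \mathcal{C} \setminus \{\mathrm{Bo}, \mathrm{Ci}\}$: 

\[ \tau_{\mathrm{Bo,Ci}}(\mathrm{Bo}) := \mathrm{Ci}, \] 
\[ \tau_{\mathrm{Bo,Ci}}(\mathrm{Ci}) := \mathrm{Bo}; \] 
\[ \forall \mathrm{Al} \{(\mathrm{Al} \in \mathcal{C} \setminus \{\mathrm{Bo}, \mathrm{Ci}\}) \Rightarrow [\tau_{\mathrm{Bo,Ci}}(\mathrm{Al}) := \mathrm{Al}]\}. \] 

A voter $V \in \mathcal{V}$ swaps two candidates Bo and Ci by changing from a ranking $\succeq_V$ to 
the ranking defined by 

\[ \{(X, Y) \in \mathcal{C} \times \mathcal{C} : \tau_{\mathrm{Bo,Ci}}(X) \succeq_V \tau_{\mathrm{Bo,Ci}}(Y)\}. \] 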


Arrow’s Impossibility Theorem applies to every set $\mathcal{V}$ of voters that is, or can be, 
strictly well-ordered by a relation < so that each nonempty subset $\mathcal{E} \subseteq \mathcal{V}$ has a first 
element $\mathrm{first}(\mathcal{E})$ and $\mathcal{V}$ has a last element $\mathrm{last}(\mathcal{V})$. The strict well-order < remains 
fixed for the entire proof. For instance, some sets of voters may be in alphabetical 
order. 


7.49 Theorem (Arrow’s Impossibility Theorem). No social welfare function 
satisfies all four conditions (AIT.1)-(AIT.4). 


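Proof. This proof expands on [4, 6, 42, 143], showing that every function satisfying 
(AIT.1), (AIT.2), (AIT.3) violates (AIT.4). For all distinct candidates Al, Bo $\in \mathcal{C}$, 
at one extreme focus on any profile r where every voter $V \in \mathcal{V}$ prefers Al to Bo, so 
that $(\mathrm{Al}, \mathrm{Bo}) \in P_{r(V)}$ for every $V \in \mathcal{V}$: 

\[ \forall V \{(V \in \mathcal{V}) \Rightarrow [(\mathrm{Al}, \mathrm{Bo}) \in P_{r(V)}]\}. \] 

By the rule of unanimity so does the aggregate ranking: $(\mathrm{Al}, \mathrm{Bo}) \in P_{\mathcal{F}(r)}$. 

At the other extreme, focus on any profile s where every voter $V \in \mathcal{V}$ prefers Bo 
to Al, so that $(\mathrm{Bo}, \mathrm{Al}) \in P_{s(V)}$ for every $V \in \mathcal{V}$: 

\[ \forall V \{(V \in \mathcal{V}) \Rightarrow [(\mathrm{Bo}, \mathrm{Al}) \in P_{s(V)}]\}. \] 

By the rule of unanimity so does the aggregate ranking: $(\mathrm{Bo}, \mathrm{Al}) \in P_{\mathcal{F}(s)}$. 

Between extremes, for each voter $V \in \mathcal{V}$ define a profile $r_V \in \mathcal{R}(\mathcal{C})^{\mathcal{V}}$ such that 
each voter $U \le V$ prefers Bo to Al, so that $\mathrm{Bo} \succ_U \mathrm{Al}$ for each $U \le V$, while each 
voter $W > V$ prefers Al to Bo, so that $\mathrm{Al} \succ_W \mathrm{Bo}$ for each $W > V$: 

\[ U \le V \Rightarrow (\mathrm{Bo}, \mathrm{Al}) \in P_{r_V(U)}; \] 
\[ V < W \Rightarrow (\mathrm{Al}, \mathrm{Bo}) \in P_{r_V(W)}. \] 

If $V = \mathrm{last}(\mathcal{V})$, then $r_V = s$ and $\mathcal{F}(r_V) = \mathcal{F}(s)$ ranks Bo ahead of Al. 

Let $\mathcal{E}_{\mathrm{Bo/Al}}$ be the subset of voters for whom $\mathcal{F}(r_V)$ ranks Bo ahead of Al: 

\[ \mathcal{E}_{\mathrm{Bo/Al}} := \{V \in \mathcal{V} : (\mathrm{Bo}, \mathrm{Al}) \in P_{\mathcal{F}(r_V)}\}. \] 

Thus $\mathrm{last}(\mathcal{V}) \in \mathcal{E}_{\mathrm{Bo/Al}} \neq \emptyset$ and $\mathcal{E}_{\mathrm{Bo/Al}}$ has a first element 
$V_{\mathrm{Bo/Al}} := \mathrm{first}(\mathcal{E}_{\mathrm{Bo/Al}})$. This 
first element $V_{\mathrm{Bo/Al}}$ is called a pivotal voter for Bo over Al. 

By the rule of Independence of Irrelevant Alternatives $V_{\mathrm{Bo/Al}}$ is a pivotal voter 
for Bo over Al starting from every profile r' where every voter ranks Al before Bo. 
Indeed, for each voter $V \in \mathcal{V}$, in r' and in r, also in s' and in s, as well as in $r'_V$ 
and in $r_V$, each voter $U < V$, V, and $W > V$ ranks Al and Bo relative to each other in 
the same way. 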

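The following considerations show that $V_{\mathrm{Bo/Al}}$ can also dictate the ranking of Bo 
relative to any third candidate Ci $\in \mathcal{C} \setminus \{\mathrm{Al}, \mathrm{Bo}\}$. To this end, consider the voters’ 
profile $q_V$ defined for $V := V_{\mathrm{Bo/Al}}$ by 

\[ q_V: \quad \begin{array}{ll} U < V: & \mathrm{Bo} \succ_U \mathrm{Ci} \succ_U \mathrm{Al}, \\ V: & \mathrm{Al} \succ_V \mathrm{Bo} \succ_V \mathrm{Ci}, \\ V < W: & \mathrm{Al} \succ_W \mathrm{Bo} \succ_W \mathrm{Ci}. \end{array} \] 

Since $V = V_{\mathrm{Bo/Al}}$ is pivotal for Bo over Al, the aggregate ranking $\mathcal{F}(q_V)$ still ranks 
Al $\succ$ Bo. Also, by unanimity $\mathcal{F}(q_V)$ ranks Bo $\succ$ Ci. By transitivity the aggregate 
ranking $\mathcal{F}(q_V)$ ranks Al $\succ$ Bo $\succ$ Ci. If $V = V_{\mathrm{Bo/Al}}$ swaps Al and Bo, then the voting 
profile changes from $q_V$ to $r_V$, defined by 

\[ r_V: \quad \begin{array}{ll} U < V: & \mathrm{Bo} \succ_U \mathrm{Ci} \succ_U \mathrm{Al}, \\ V: & \mathrm{Bo} \succ_V \mathrm{Al} \succ_V \mathrm{Ci}, \\ V < W: & \mathrm{Al} \succ_W \mathrm{Bo} \succ_W \mathrm{Ci}. \end{array} \] 

Since $V = V_{\mathrm{Bo/Al}}$ is pivotal for Bo over Al, the aggregate ranking $\mathcal{F}(r_V)$ now ranks 
Bo $\succ$ Al. By Independence of Irrelevant Alternatives $\mathcal{F}(r_V)$ still ranks Al $\succ$ Ci, 
because each voter ranks Al and Ci in the same order in both profiles $q_V$ and $r_V$. By 
transitivity, the aggregate ranking $\mathcal{F}(r_V)$ ranks Bo $\succ$ Al $\succ$ Ci. If any, some, or all 
voters other than $V = V_{\mathrm{Bo/Al}}$ swap Bo and Ci, then the voting profile changes from 
$r_V$ to $s_V$, defined by 

\[ s_V: \quad \begin{array}{ll} U < V: & \mathrm{Ci} \succ_U \mathrm{Bo} \succ_U \mathrm{Al} \ \text{or} \ \mathrm{Bo} \succ_U \mathrm{Ci} \succ_U \mathrm{Al}, \\ V: & \mathrm{Bo} \succ_V \mathrm{Al} \succ_V \mathrm{Ci}, \\ V < W: & \mathrm{Al} \succ_W \mathrm{Ci} \succ_W \mathrm{Bo} \ \text{or} \ \mathrm{Al} \succ_W \mathrm{Bo} \succ_W \mathrm{Ci}. \end{array} \] 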

Then Al and Bo are ranked in the same order by every voter in $r_V$ and $s_V$, whence the
aggregate ranking $\mathcal{F}(s_V)$ still ranks Bo $\succ$ Al. Similarly, Al and Ci are ranked in the
same order by every voter in $r_V$ and $s_V$, whence the aggregate ranking $\mathcal{F}(s_V)$ still
ranks Al $\succ$ Ci. By transitivity the aggregate ranking $\mathcal{F}(s_V)$ ranks Bo $\succ$ Al $\succ$ Ci,
even though every voter other than $V$ ranks Ci $\succ$ Bo.

Candidate Al plays only an auxiliary rôle in the proof that $V_{\mathrm{Bo/Al}}$ is a dictator for
Bo over Ci. If any, some, or all voters swap Al and any candidate Ig other than Bo
and Ci, then the voting profile changes from $s_V$ to $p_V$, defined by


$p_V$:
  $U \prec V$:  Ci $\succ_U$ Bo $\succ_U$ Al or $\cdots \succ_U$ Ci $\succ_U$ Bo $\succ_U \cdots$,
  $V$:  Bo $\succ_V$ Al $\succ_V$ Ci or $\cdots \succ_V$ Bo $\succ_V$ Ci $\succ_V \cdots$,
  $V \prec W$:  Al $\succ_W$ Ci $\succ_W$ Bo or $\cdots \succ_W$ Ci $\succ_W$ Bo $\succ_W \cdots$.


Then Bo and Ci are ranked in the same order by every voter in $s_V$ and $p_V$, whence
the aggregate ranking $\mathcal{F}(p_V)$ still ranks Bo $\succ$ Ci, even though every voter other
than $V$ ranks Ci $\succ$ Bo.

For the uniqueness of the dictator, in the strict well-ordering $\prec$ of the voters $\mathcal{V}$,
a pivotal voter $V_{\mathrm{Bo/Ci}}$ for Bo over Ci cannot appear later than such a dictator for Bo
over Ci as $V_{\mathrm{Bo/Al}}$, because such an earlier dictator $V_{\mathrm{Bo/Al}}$ would determine the ranking
of Bo over Ci before a pivotal voter $V_{\mathrm{Bo/Ci}}$ does; consequently, $V_{\mathrm{Bo/Ci}} \preceq V_{\mathrm{Bo/Al}}$.

Similarly, a pivotal voter $V_{\mathrm{Ci/Bo}}$ for Ci over Bo cannot come earlier than such a
dictator for Bo over Ci as $V_{\mathrm{Bo/Al}}$, for otherwise such a later dictator could reverse a
pivotal vote from the pivotal voter; therefore $V_{\mathrm{Bo/Al}} \preceq V_{\mathrm{Ci/Bo}}$.

Combining $V_{\mathrm{Bo/Ci}} \preceq V_{\mathrm{Bo/Al}}$ with $V_{\mathrm{Bo/Al}} \preceq V_{\mathrm{Ci/Bo}}$ gives $V_{\mathrm{Bo/Ci}} \preceq V_{\mathrm{Bo/Al}} \preceq V_{\mathrm{Ci/Bo}}$ by
transitivity.

Reversing the rôles of Bo and Ci gives the reverse ranks $V_{\mathrm{Ci/Bo}} \preceq V_{\mathrm{Ci/Al}} \preceq V_{\mathrm{Bo/Ci}}$.

Consequently, $V_{\mathrm{Bo/Ci}} = V_{\mathrm{Bo/Al}} = V_{\mathrm{Ci/Bo}}$.

Therefore there is exactly one pivotal voter, who is the pivotal voter and the
dictator for every pair of candidates.

Thus every social welfare function satisfying (AIT.1), (AIT.2), (AIT.3) violates
(AIT.4). $\square$
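
The search for the pivotal voter in the foregoing proof can also be traced by machine. The following Python sketch is an illustration only and is not taken from the cited proofs: the names `profile_r`, `dictatorship`, `borda_prefers`, and `pivotal_voter` are hypothetical, the voters are modelled as the indices 0, ..., n - 1 in their well-order, and every ranking is assumed to be strict and listed from the most preferred to the least preferred candidate. The Borda count serves merely as a familiar aggregation rule, which satisfies Unanimity but violates Independence of Irrelevant Alternatives.

```python
def profile_r(n_voters, v, al="Al", bo="Bo", ci="Ci"):
    """Profile r_V: voters 0..v rank Bo ahead of Al, later voters rank Al ahead of Bo."""
    return [(bo, al, ci) if u <= v else (al, bo, ci) for u in range(n_voters)]


def dictatorship(k):
    """Aggregate rule that copies the strict ranking of voter k (assumed example)."""
    def prefers(profile, x, y):
        ranking = profile[k]
        return ranking.index(x) < ranking.index(y)
    return prefers


def borda_prefers(profile, x, y):
    """Borda count: x is ranked strictly ahead of y if x collects more points."""
    def score(candidate):
        return sum(len(r) - 1 - r.index(candidate) for r in profile)
    return score(x) > score(y)


def pivotal_voter(prefers, n_voters, al="Al", bo="Bo"):
    """First voter V such that the aggregate of r_V ranks Bo ahead of Al."""
    for v in range(n_voters):
        if prefers(profile_r(n_voters, v, al, bo), bo, al):
            return v
    return None  # cannot happen under Unanimity, because r_last = s


if __name__ == "__main__":
    print(pivotal_voter(dictatorship(3), 5))  # the dictator is the pivotal voter
    print(pivotal_voter(borda_prefers, 5))    # the voter who tips the majority
```

With a dictatorship the pivotal voter coincides with the dictator, as the proof predicts; with the Borda count and five voters the pivotal voter is merely the one who tips the balance for this particular pair.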


Another rule that might be imposed on a social welfare function pertains to the 
anonymity of voters. One way to model the concept of anonymity of voters consists 
in requiring that a social welfare function remains invariant under all permutations 
of the voters. 


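7.50 Definition. A social welfare function $\mathcal{F} : \mathcal{R}(\mathcal{C})^{\mathcal{V}} \to \mathcal{R}(\mathcal{C})$ is invariant
under permutations if and only if $\mathcal{F}(r \circ \sigma) = \mathcal{F}(r)$ for every profile $r : \mathcal{V} \to \mathcal{R}(\mathcal{C})$
and every permutation $\sigma$ of the set $\mathcal{V}$ of voters.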


(AIT.5) Anonymity. A social welfare function respecting voters’ anonymity is 
invariant under permutations. 
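
Invariance under permutations can be tested exhaustively for small electorates. The following Python sketch is only an illustration under stated assumptions: the names `borda_scores`, `first_voter_dictates`, and `is_anonymous` are hypothetical, rankings are strict tuples of the three candidates, a profile is a tuple of rankings indexed by voters, and the Borda aggregate is represented by its table of scores.

```python
from itertools import permutations, product

CANDIDATES = ("Al", "Bo", "Ci")
RANKINGS = list(permutations(CANDIDATES))  # all strict rankings


def borda_scores(profile):
    """Aggregate represented by Borda scores (assumed example of an anonymous rule)."""
    return {c: sum(len(r) - 1 - r.index(c) for r in profile) for c in CANDIDATES}


def first_voter_dictates(profile):
    """Aggregate that copies the ranking of voter 0 (assumed example of a dictatorship)."""
    return profile[0]


def is_anonymous(swf, n_voters):
    """Check F(r o sigma) == F(r) for every profile r and every permutation sigma."""
    for profile in product(RANKINGS, repeat=n_voters):
        for sigma in permutations(range(n_voters)):
            permuted = tuple(profile[sigma[v]] for v in range(n_voters))
            if swf(permuted) != swf(profile):
                return False
    return True


if __name__ == "__main__":
    print(is_anonymous(borda_scores, 3))
    print(is_anonymous(first_voter_dictates, 3))
```

Such a finite search proves nothing about larger sets of voters or candidates, but it shows concretely how a dictatorship fails the test while a rule that depends only on vote counts passes.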


Yet another condition attempts to model the condition that no voters have a right 
of veto. 


(AIT.6) No Veto. Except perhaps for voter Val, if none of the other voters prefer 
any other candidate to Al, in other words, if all the other voters prefer Al to all 
the other candidates, or are indifferent between Al and all the other candidates, 
then the social welfare function does not rank any other candidate ahead of Al, 
regardless of Val’s rankings [115, p. 386]. 


There is an alternative variation of the concept of unanimity.


(AIT.7) Weak Unanimity. Except perhaps for voter Val, if all the other voters do
not prefer Bo to Al, so that they might prefer Al to Bo or be indifferent between
Al and Bo, but Val prefers Al to Bo, then the social welfare function prefers Al
to Bo:


$$\forall X \forall Y \forall r (\{ [r \in \mathcal{R}(\mathcal{C})^{\mathcal{V}}] \wedge [(X, Y) \in P_{r(\mathrm{Val})}] \wedge \forall V [(Y, X) \notin P_{r(V)}] \} \Rightarrow [(X, Y) \in P_{\mathcal{F}(r)}]).$$


7.4.4 Exercises on Arrow’s Impossibility Theorem 


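7.25. Prove that for each relation $R \subseteq \mathcal{C} \times \mathcal{C}$ the relation $P_R := R \setminus R^{\circ -1}$ is
irreflexive.


7.26. Prove that for each relation $R \subseteq \mathcal{C} \times \mathcal{C}$ the relation $P_R := R \setminus R^{\circ -1}$ is
asymmetric.


7.27. Prove that if a relation $R \subseteq \mathcal{C} \times \mathcal{C}$ is transitive, then $P_R := R \setminus R^{\circ -1}$ is also
transitive.


7.28. Prove that for each relation $R \subseteq \mathcal{C} \times \mathcal{C}$ the relation $P_R := R \setminus R^{\circ -1}$ is
anti-symmetric.


7.29. Prove that if a relation $R \subseteq \mathcal{C} \times \mathcal{C}$ is transitive, then $P_R := R \setminus R^{\circ -1}$ is a
partial order.


7.30. Prove that if a relation $R \subseteq \mathcal{C} \times \mathcal{C}$ is transitive, then the inverse relation $R^{\circ -1}$
is also transitive.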


7.31. With at least two voters and at least two alternatives, prove that every social 
welfare function conforming to the rules of Unrestricted Domain and Anonymity 
has no dictators. 


7.32. With at least two voters and at least three alternatives, prove that every social 
welfare function conforming to the rules of Unrestricted Domain, Unanimity, and 
Independence of Irrelevant Alternatives is not invariant under permutations of the 
voters. 


7.33. Determine whether either of Unanimity and Weak Unanimity implies the 
other. 


7.34. Determine whether Weak Unanimity is compatible with No Veto. 
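
Before attempting proofs for exercises 7.25-7.27, the claims about the strict part of a relation may be checked on a small set of candidates. The following Python sketch is a finite sanity check and not a proof: the function names are hypothetical, and a relation is assumed to be represented as a Python set of ordered pairs.

```python
from itertools import product


def strict_part(relation):
    """Pairs (x, y) of the relation whose reverse (y, x) is not in the relation."""
    return {(x, y) for (x, y) in relation if (y, x) not in relation}


def is_irreflexive(relation):
    return all(x != y for (x, y) in relation)


def is_asymmetric(relation):
    return all((y, x) not in relation for (x, y) in relation)


def is_transitive(relation):
    return all((x, w) in relation
               for (x, y) in relation
               for (z, w) in relation
               if y == z)


def all_relations(elements):
    """Enumerate every binary relation on the given elements."""
    pairs = list(product(elements, repeat=2))
    for mask in range(2 ** len(pairs)):
        yield {p for i, p in enumerate(pairs) if (mask >> i) & 1}


def check_strict_parts(elements):
    """Test irreflexivity and asymmetry always, transitivity when R is transitive."""
    for relation in all_relations(elements):
        strict = strict_part(relation)
        if not (is_irreflexive(strict) and is_asymmetric(strict)):
            return False
        if is_transitive(relation) and not is_transitive(strict):
            return False
    return True


if __name__ == "__main__":
    print(check_strict_parts(("Al", "Bo", "Ci")))  # examines all 512 relations
```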
Exercises from Chapter 1 


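1.1 Use axioms P1 and P2, theorem 1.12, and Modus Ponens:


$\vdash (L) \Rightarrow [(K) \Rightarrow (L)]$   axiom P1,
$\vdash (H) \Rightarrow \{(L) \Rightarrow [(K) \Rightarrow (L)]\}$   theorem 1.12,
$\vdash [(H) \Rightarrow \{(L) \Rightarrow [(K) \Rightarrow (L)]\}] \Rightarrow \{[(H) \Rightarrow (L)] \Rightarrow \{(H) \Rightarrow [(K) \Rightarrow (L)]\}\}$   axiom P2,
$\vdash [(H) \Rightarrow (L)] \Rightarrow \{(H) \Rightarrow [(K) \Rightarrow (L)]\}$   Modus Ponens.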


1.3 Use the reflexivity of implications and the law of commutation: 


F {[(A) = (B)] = (B)} = {[) = (B)] > (B)} P:= {[(A) > @)] 
= (B)} in 1.14, 
F [(A) > (B)| > [{[(A) (B)] > (B} => (B) | substitution in the law 
rr aad — ae of commutation. 


1.5 Substitute P for A and also P for B in the preceding tautology: 


F [(A) > (B)] > [{[(A) (B)| > (B)}} => (B)| preceding tautology, 
F [(P) > (P)| > [{[(P) (P)| > (P)} => (P)| substitutions, 

F (P) => (P) theorem 1.14, 

F {{(P) > (P)] > (P)} = (P) Modus Ponens. 


1.7 The formula {[(P) > (Q)] (P)} (P) cannot be proved using only 
implications. 


1.9 Use the tautology [(H) > (L)] > {(A) => [(K) > (D]}: 


FIR =O> 3S) >OQOl=l6) =>_) } 
—_—<—— —S— —_ —<—— —— —S—’ 
L L 


H AH K 
1.11 See also [72, p. 34]: 
F (P) = {[(P) = (P)] 


(P)} 


 {(P) => [(P) > (P)]} 
([(P) => {[(P) 
>I?) SP) 


(P)] > (P)}] 


F (P) > [(P) > @)] 
substitution in axiom 1a, 


substitution in axiom 1b, 


substitution in axiom la, 


F [(P) > {(P) > (P)] > (P)}] 


[(P) > (P)] 


F [(P) => (P)] 
1.13 Apply exercise 1.11 and Detachment: 
F [(P) = (P)] => [IP) > [(P) > (I 
= [(P) = @)] 


F {[(P) > [P) >= @B => [P) > @I 
1.15 Apply exercise 1.12 and Detachment: 

+ T 

F (A) > (7) 


F [(A) = (T)] > ({(A) > (7) > (OB 
= [(A) > (O)]) 
F {(A) > [(7) 
1.17 
 [(B) > (C)] > {(A) > [(B) 
+ ([(B) > (C)] = {(A) 


(O} = (A) = ©] 


(C)]} 


Detachment. 
Detachment. 


substitution in 1b, 


Detachment. 


hypothesis, 
exercise 1.12, 
substitution in 1b, 


Detachment. 


[(B) = (O}}) > 


[(B) = (O} 


[(A) 


({(8) > ©] > (A) 
{(B) > (© > (A) 


(Oo) 


(c)])} 
1b, 


F {[B) > (©) > ((A) = [B) 


[(B) > (C)] > (A) 


(C)]} 


(C)}} = [(A) > ())} 


Detachment. 


1.19 Apply exercises 1.15, 1.18, and Detachment: 


F [(A) = (B)] = ({(A) = [(B) > (C)} 


[(A) > (O)]) 1b, 


F [(B) > (©) 


(CB = [A) 


Le, 


{[(A) = (B)] = ({(A) > [B) 
F {[(B) > (C)] = [A) = @)]} 


(C)])} 


{[) + (Ol > ({) 


[(B) > (C)]} > [A) 


(c)))} 


1.21 $\vdash [(A) \Rightarrow (B)] \Rightarrow \{[(B) \Rightarrow (C)] \Rightarrow [(A) \Rightarrow (B)]\}$ by substitution in 1a.


1.23 Apply Tarski’s axiom III and Detachment:


$\vdash (H) \Rightarrow (K)$   hypothesis,
$\vdash [(H) \Rightarrow (K)] \Rightarrow \{[(K) \Rightarrow (L)] \Rightarrow [(H) \Rightarrow (L)]\}$   substitution in axiom III,
$\vdash [(K) \Rightarrow (L)] \Rightarrow [(H) \Rightarrow (L)]$   Detachment,
$\vdash (K) \Rightarrow (L)$   hypothesis,
$\vdash (H) \Rightarrow (L)$   Detachment.


1.25 - (P) > {[(P) > (Q)] => (P)} by substitution in axiom I. 
1.27 Apply exercises 1.23, 1.25, and 1.26: 


(P )=((P) >= @1= (P) >= @1 => @)). 


— 
H 


L 
1.29 Apply exercises 1.27, 1.28, and 1.23: 


(P) > t[(P) >= (Q)] > (Q)}. 
1.31 Apply exercise 1.29 and axiom III: 
F{(Q) = [(Q) > (®)) > Re => 
[1 > @] > @} > () > @®) 


() > (P) > | I, 
F (Q) = [(Q) = (R)] > (8) 1.29, 
F ({[(Q) => (R)] > (R)} => [(P) > (®))) 
{(Q) = [(P) > (R)]}} Detachment. 


1.33. Apply axiom III, exercise 1.32, and Detachment: 

F [(P) > (Q)] > tQ) = (R)] = [(P) = (R)]} axiom IIT, 

F [(Q) = (R)] = tl) > (Q)] = [(P) = (R)]} 1.32, Detachment. 
1.35 Apply axiom II with exercises 1.24, 1.33, and 1.23: 


F ([(P) > (Q)] > {(P) > [(P) > (23) 
= (OH) S1h) > @BF 1H) > @) > 1H) > @> ll) > (@®]} UL 


F ({(P) > [(P) > @} => ((P) > @))) 
= ([(P) > OQ) >= {&) = [(P) > @}) > 1) > @) > [f) > @]} 1.32, 


F {(P) > [(P) > (R)]}} > [(P) > (®)] I, 
F ((P) > (Q)) > {(P) > [(P) > ®@)}) > (P) > ©] > [(P) > @} Detachment, 
F{Q>[P) > @BP> (PF) > OF (fH) > @B 1.34. 


[(P) > (Q)] > (LP) => [(Q) > (R)}} > [P) > (YH). 
1.37 Proceed as in theorem 1.40:


$\vdash (P) \Rightarrow [(Q) \Rightarrow (P)]$   axiom F1,
$\vdash [(Q) \Rightarrow (P)] \Rightarrow \{[\neg(P)] \Rightarrow [\neg(Q)]\}$   axiom F4,
$\vdash (P) \Rightarrow \{[\neg(P)] \Rightarrow [\neg(Q)]\}$   transitivity of implications (theorem 1.16).


1.39 A proof of {[-(Q)] > [-(P)]} => [(P) > (Q)] can proceed as follows. 


F {[-(@)] > F-@))}} > (FE => {-[-(@)]}) axiom F4, 


F (P) = {-[-()}} axiom F6, 

F {[-(Q)] > [-(P)}} => ((P) => {-[-(Q)]}) transitivity (theorem 1.31), 
F {-[-(Q)]} > (Q) axiom FS, 

F {[=(Q)] > [A(P)]}} > [(P) > (Q)] transitivity (theorem 1.32). 
1.41 


{[((B) > (F)] > (4) > (3 > (LB) > /] > (4)} > {(B) > (F)] > (F)}) axiom C2, 
{[(B) > (F)] > (F)} => (B) axiom C3, 
{[(8) > ()] > (4) > (HB > [{B) > ()] > (4)} > ()] transitivity, 
(A) > {[(B) > (F)] > (A)} Cl, 
{[(B) => (F)] => [(A) = (F)]} > [(A) = (B)] transitivity. 


1.43 Substitutions of =(B) for (B) => (F) and —(A) for (A) => (F) into the solution 
of exercise 1.41 transform Church’s third axiom into {[-=(Q)] => [-(P)]} => 
[((P) => (Q)]. Consequently, all three axioms of classical logic remain valid in 
Church’s logic, and, therefore, Church’s logic allows for proofs of all the theorems 
of classical logic. 


1.45 For the first theorem, 


F [>(P)] = {[(S) => (S)] > [-(P) axiom P1, 

F {[(S) > (9) > FO} > K-F@)]} > {-((5) => (5)]}}) law of contraposition, 

F [>(P)] > ({(-[(P)]}} > {-[(5) > (5)]}) transitivity, 

F (P) = {-[-(P)} converse double negation, 
F [>(P)] > [(P) > {-16) > ()3] transitivity. 


For the second theorem, 


F (S) => (S) theorem 1.14, 
F (P) => [(S) > (S)] theorem 1.12, 
F [(P) > (Q)] > ({(P) => [-(Q)} > [-(P))) reduction ad absurdum, 


F {(P) > [(S) > (S)]} > {[P) > {[(S) > (S)]}}] S [-(P)]} substitution, 
a [(P) => {-[(S) > (S)}] => [|F(P)] Modus Ponens. 
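1.47 Peirce’s law:

$\{[(P) \Rightarrow (Q)] \Rightarrow (P)\} \Rightarrow (P)$
$\Updownarrow$ definition of $\vee$,
$[\neg(\neg\{[\neg(P)] \vee (Q)\} \vee (P))] \vee (P)$
$\Updownarrow$ de Morgan’s second law,
$\{(\neg\neg\{[\neg(P)] \vee (Q)\}) \wedge [\neg(P)]\} \vee (P)$
$\Updownarrow$ double negations,
$(\{[\neg(P)] \vee (Q)\} \wedge [\neg(P)]) \vee (P)$
$\Updownarrow$ distributivity,
$(\{[\neg(P)] \vee (Q)\} \vee (P)) \wedge \{[\neg(P)] \vee (P)\}$
$\Updownarrow$ commutativity, associativity, excluded middle,
$[(Q) \vee (T)] \wedge (T)$
$\Updownarrow$ identity,
$(T) \wedge (T)$
$\Updownarrow$ idempotence,
$(T)$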
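
The algebraic derivation of Peirce’s law can be cross-checked with a Truth table. The following Python sketch is only an illustration: the helper names `implies` and `is_tautology` are hypothetical, and the check covers the classical two-valued semantics only, not provability from any particular list of axioms.

```python
from itertools import product


def implies(p, q):
    """Classical material implication."""
    return (not p) or q


def is_tautology(formula, n_vars):
    """True if the formula evaluates to True on every row of its Truth table."""
    return all(formula(*row) for row in product((False, True), repeat=n_vars))


def peirce(p, q):
    """{[(P) => (Q)] => (P)} => (P)."""
    return implies(implies(implies(p, q), p), p)


def antecedent_only(p, q):
    """[(P) => (Q)] => (P), which is not a tautology."""
    return implies(implies(p, q), p)


if __name__ == "__main__":
    print(is_tautology(peirce, 2))
    print(is_tautology(antecedent_only, 2))
```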


(PH) > OM B= (R= Ol > MP} 


? 


definition of V, 


= 


[-=(={[-(P)] Vv (Q)} Vv (R)})] V [FI @)] v (P)} Vv (P) 


? 


de Morgan’s second law, 


{FCP vO] A F@} v [((-F@B A Fey) v 
¢ double negations, 
({-@)] Vv (} A FR) v [R)A FOB v )] 
¢ — distributivity, 
({I-(@)] A [>)}} V {(Q) A [>@)}) V ([R) Vv @)] A {-@)] v (PH) 
¢ excluded middle, 


Se 


({[-@)] A QB Vv () A [7(R)3) V IR) V @]A 


°? 


identity, 


({I-(@)] A [73 V {() A [7@®)}) V [(®) Vv 


¢ de Morgan’s second law, 
({-[@) V (BV {) A [7] V [R) V (P)] 

¢ commutativity, 

associativity, 

({-[®) Vv (PB v [(R) V PV {(Q) A F@)B) 

¢ excluded middle, 

(T) V {Q) A [>R)B) 
¢ identity, 


$(T)$

1.51

$\vdash (P) \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow (Q)\}$   theorem 1.26,

$\vdash \{[(P) \Rightarrow (Q)] \Rightarrow (Q)\} \Rightarrow ([\neg(Q)] \Rightarrow \{\neg[(P) \Rightarrow (Q)]\})$   theorem 1.44,

$\vdash (P) \Rightarrow ([\neg(Q)] \Rightarrow \{\neg[(P) \Rightarrow (Q)]\})$   theorem 1.16.


1.53 One implication is axiom P1 and the other implication is Peirce’s law. 


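1.55 $[(P) \Rightarrow (Q)] \Leftrightarrow (\neg\{(P) \wedge [\neg(Q)]\})$:


$(P) \Rightarrow (Q)$
$\Updownarrow$ double negations,
$(P) \Rightarrow \{\neg[\neg(Q)]\}$
$\Updownarrow$ double negations,
$\neg(\neg\{(P) \Rightarrow \neg[\neg(Q)]\})$
$\Updownarrow$ definition of $\wedge$,
$\neg\{(P) \wedge [\neg(Q)]\}$
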
1.57 

r (V) => (W) hypothesis, 

F (=(W)) > (-(V)) contraposition, 

F (R) = (S) hypothesis, 

F [-(V)] > [(R) => (S)] theorem 1.12, 


F {> = [R) = (Sb 
= {I-M)) = (8) > [((W)) = (S)]} axiom P2, 


F [(-(V)) > (R)] > (0) = ()] Modus Ponens, 
F [(-=(V)) > (R)] > [-W)) = (S)] second line and theorem 1.16, 
F [(V) v (R)] = [(W) v (S)] definition of v. 

1.59 No, the suggested rule fails if U and W are False but V is True. 

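1.61


$\vdash ([(P) \Rightarrow (S)] \wedge \{[\neg(Q)] \Rightarrow [\neg(S)]\}) \Rightarrow (\{(P) \wedge [\neg(Q)]\} \Rightarrow \{(S) \wedge [\neg(S)]\})$   theorem,


$\vdash (\{(P) \wedge [\neg(Q)]\} \Rightarrow \{(S) \wedge [\neg(S)]\}) \Rightarrow [(\neg\{(S) \wedge [\neg(S)]\}) \Rightarrow (\neg\{(P) \wedge [\neg(Q)]\})]$   contraposition,
$\vdash [(\neg\{(S) \wedge [\neg(S)]\}) \Rightarrow (\neg\{(P) \wedge [\neg(Q)]\})] \Rightarrow (\{[\neg(S)] \vee (S)\} \Rightarrow [(P) \Rightarrow (Q)])$   de Morgan, definition of $\wedge$,
$\vdash [\neg(S)] \vee (S)$   excluded middle,
$\vdash (P) \Rightarrow (Q)$   Modus Ponens.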


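1.63

$\vdash (R) \Rightarrow [(T) \Rightarrow (R)]$   axiom P1;


$\vdash (A) \Rightarrow \{[(A) \Rightarrow (B)] \Rightarrow (B)\}$   theorem,
$\vdash (T) \Rightarrow \{[(T) \Rightarrow (R)] \Rightarrow (R)\}$   substitutions in theorem,
$\vdash T$   hypothesis,
$\vdash [(T) \Rightarrow (R)] \Rightarrow (R)$   Modus Ponens.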


1.65 Use {[=(R)] > (F)} = t[-()] = (R)} and [(7) = (R)] = (8). 


1.67 Definition 1.51 of (A) = (B) as [(A) > (B)JA[(B) => (A)] with theorems 1.57 
[(P) A (Q)] = [(Q) A (P)] and 1.52 [(P) A (Q)] = (Q) yield Tarski’s axiom IV. 


1.69 Theorem 1.61 gives a derivation of (J) = (J). from (J) > (J) and (J) => (J). 
The Deduction Theorem (1.22) then yields a proof of [(() > (V)] => {{V) => 
DO] > () > VI}. 


1.71 Subsections 1.3.10, 1.3.11, and 1.3.12 show that axioms P1 and P2 are
derivable from Tarski’s axioms I-III. Moreover, axiom P3 coincides with Tarski’s
axiom VII. Consequently, axioms P1, P2, and P3 are derivable from Tarski’s
axioms I-VII. Therefore, every theorem of the Classical Propositional Calculus is
also derivable from Tarski’s axioms I-VII.


1.73 Substitute $R$ for $S$ in theorem 1.82, which gives $\{[(P) \Rightarrow (Q)] \wedge [(R) \Rightarrow (R)]\} \Rightarrow \{[(P) \wedge (R)] \Rightarrow [(Q) \wedge (R)]\}$. Then apply the reflexivity of the
logical implication (theorem 1.14), the law of contraposition (theorem 1.44), and
transitivity, to derive Rosser’s axiom R3 from axioms P1, P2, and P3.


1.75 Kleene’s axiom 7 is the law of reductio ad absurdum (theorem 1.48). 


1.77 {{(P) > (@)] (P)} => (P) is a triadic tautology in Lukasiewicz’s triadic 
logic. 

1.79 $\{[(P) \Rightarrow (Q)] \Rightarrow (P)\} \Rightarrow (P)$ is not a triadic tautology in Church’s system or
in Łukasiewicz’s.


1.81 $\{[(P) \Rightarrow (Q)] \Rightarrow (Q)\} \Rightarrow \{[(Q) \Rightarrow (P)] \Rightarrow (P)\}$ is a triadic tautology in
Łukasiewicz’s system, but not in Church’s.


1.83 $\{(P) \Rightarrow [(Q) \Rightarrow (R)]\} \Rightarrow \{[(P) \Rightarrow (Q)] \Rightarrow [(P) \Rightarrow (R)]\}$ is a triadic
tautology in Church’s system, but not in Łukasiewicz’s.


1.85 $\{(P) \Rightarrow [(Q) \Rightarrow (R)]\} \Rightarrow \{(Q) \Rightarrow [(P) \Rightarrow (R)]\}$ is a triadic tautology in both
Church’s system and Łukasiewicz’s.
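
The claims about Łukasiewicz’s system in the solutions to exercises 1.79-1.85 can be checked by brute force over the 9 or 27 rows of a triadic table. The following Python sketch rests on an assumption that should be compared with the text’s own tables: it uses the customary Łukasiewicz three-valued implication, with the values 0 (False), 1 (intermediate), 2 (True), the value min(2, 2 - p + q) for an implication, and 2 as the only designated value. Church’s system is not modelled here, and all function names are hypothetical.

```python
from itertools import product

VALUES = (0, 1, 2)  # 0 = False, 1 = intermediate, 2 = True


def imp(p, q):
    """Customary Lukasiewicz three-valued implication (assumed definition)."""
    return min(2, 2 - p + q)


def is_triadic_tautology(formula, n_vars):
    """True if the formula takes the designated value 2 on every row."""
    return all(formula(*row) == 2 for row in product(VALUES, repeat=n_vars))


def peirce(p, q):                      # exercise 1.79
    return imp(imp(imp(p, q), p), p)


def exchange_of_maxima(p, q):          # exercise 1.81
    return imp(imp(imp(p, q), q), imp(imp(q, p), p))


def self_distribution(p, q, r):        # exercise 1.83
    return imp(imp(p, imp(q, r)), imp(imp(p, q), imp(p, r)))


def commutation(p, q, r):              # exercise 1.85
    return imp(imp(p, imp(q, r)), imp(q, imp(p, r)))


if __name__ == "__main__":
    print(is_triadic_tautology(peirce, 2))
    print(is_triadic_tautology(exchange_of_maxima, 2))
    print(is_triadic_tautology(self_distribution, 3))
    print(is_triadic_tautology(commutation, 3))
```

Under the stated assumption the four results agree with the Łukasiewicz half of each solution: only the formulae of exercises 1.81 and 1.85 pass.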


1.87 


$\vdash [\neg(P)] \Rightarrow (P)$   hypothesis,
$\vdash (P) \Rightarrow (P)$   theorem 1.14,
$\vdash P$   deduction rule.


1.89 In the proof of {{[-=(P)] => (P)} F (P), the first step lists the hypothesis H, here 
+ {[-(P)] = (P)}. The Deduction Theorem replaces this step, / H, by a complete 
proof of (HW) => (A), from theorem 1.14, with [=(P)] = (P) substituted for H 
everywhere. The second step invokes theorem 1.14, which the Deduction Theorem 
replaces by a complete proof of theorem 1.14. The third step uses the deduction rule 


+ [=(R)] = (S) hypothesis, 
- (R) => (S) theorem 1.15, 
- § deduction rule, 


which the Deduction Theorem replaces by a complete proof of this deduction rule. 


1.95 The commutation law establishes the logical equivalence 


{(P) > Q)] = [(Q > @Mi} > @) > _() ] 
=—_e=__- eer oeo——-—-—-—-” ——" —— 
K L 


H 


¢ 


(2) >[l/) > Ql>lQ => @}=> |. 
= —— 


(R) 
eed 
H L 
To prove either formula with the Deduction Theorem, this proof starts by assuming 
that the hypotheses H and K, here Q and [(P) > (Q)] => [(Q) => (R)], are True. 


F [(P) > (Q)] > [Q) > ®)] 

FLQ> {P) > OF (@ > @M}) > KQ@ > (/) > O@} > {@ > [@ => ()}) 
F {(0) > [(P) > (O)}} > {) > [(Q > 

F (Q) > [(P) > ()) 

F (Q) > [(Q) = (R)] 

FO 

FO=a 

reo 

ER 


The result then follows from the Deduction Theorem. 


1.97 The formula $S$ defined by $\{[\neg(P)] \Rightarrow (P)\} \Rightarrow (P)$ has only one propositional
variable, $P$. Moreover, $S$ has the form $(V) \Rightarrow (W)$, with $[\neg(P)] \Rightarrow (P)$ for $V$,
and $P$ for $W$. Thus the first step consists in applying the Provability Theorem
(theorem 1.125) to prove $P' \vdash S'$.


PTrue_ If P is True, then so is W, and the single line | P constitutes a proof 
of | W. Hence a copy of the proof of theorem 1.12 forms a complete proof of 
P+ (V) > (W), whichis PFS’. 

P False If P is False, then W is also False, but so is V. Thus, V’ is —(V), which 
is —{[-(P)] = (P)}. In this case the Provability Theorem calls for a proof of 
P’ + V’. However, V is False and has the form (H) => (K), with =(P) for H, and 
P for K. Hence the Provability Theorem calls for proofs of P’ + H’ and P’ F K’, 
which are here [—(P)] - [-(P)] and [=(P)]  [-(P)]. In both cases the proof 
of [=(P)] - [7(P)] is a substitution in the proof of theorem 1.14. Hence follows 
a proof of P’ | (A’) A (K’), which is here [=(P)] - [=(P)] A [7(P)], and, by 
definition of A the same proof shows that [=(P)] + —{[-=(P)] = (P)}, which is 
[=(P)] - [-(V)], or, equivalently, P’ F V’. 

Thence follows a proof of P’ F {[-(W)] = [-(V)]} and hence by contraposition 
a proof of P’ F [(V) => (W)], which is again P’ F S’. 


1.99 The law of reductio ad absurdum, S, 


[(P) > Q)] > (K(P) > FO) = [-))). 


has the form (V) = (W), with (P) = (Q) for V, and {(P) > [-(Q)]} > [-()] 
for W. 


P True, Q True If P is True and Q is True, then W is True. However, W has the 
form (H) => (K), with (P) > [-(Q)] for H, which is False, and =(P) for K, 
which is also False. Hence the Provability Theorem calls for a proof of P’, Q’ + 
H', here P,Q + —(A). Because H has the form (P) = [=(Q)], and hence —=(H) 
has the form (P) A {=[=(Q)]}, the Provability Theorem calls for proofs of P,Q F 
P and P,Q Q, which follow from substitutions in the proof of theorem 1.14: 


F (P) = (P) theorem 1.14, 

F (Q) > (Q) theorem 1.14, 

F (P) > {(Q) => [(P) A (Q)]} theorem 1.82, 

+ (P) > [(Q) > {-[(P) > [-(Q)]}] definition of A, 

F (P) > {(Q) => [-(D)]}} substitution, 

F (P) = [(Q) > {[7(K)] = [-(4)]}] axiom P1 and theorem 1.16, 
F (P) = {(Q) = [(A) = (K)}} axiom P3 and theorem 1.16; 
r (P) > [(Q) > (W)] substitution, 

F (W) => [(V) > (W) axiom P1, 

F (P) > {(Q) => [(V) > (W)]} theorem 1.32, 


$\vdash (P) \Rightarrow [(Q) \Rightarrow (S)]$   substitution.

P False  If $P$ is False, then $W$ is True, regardless of whether $Q$ is True or False, and
in either case the same proof of theorem 1.14 gives a proof of $[\neg(P)] \Rightarrow [\neg(P)]$,
whence a proof of $[\neg(P)] \Rightarrow (W)$, and hence a proof of $[\neg(P)] \Rightarrow [(V) \Rightarrow (W)]$,
which is $[\neg(P)] \Rightarrow (S)$.

P True, Q False If P is True but Q is False, then W is False. Because S is a 
tautology, V is also False. Hence the Provability Theorem calls for a proof of 
P’,Q’ + V’, here P,[=(Q)] / —(V). Because V has the form (H) => (K), the 
Provability Theorem calls for proofs of P’,Q’ | H and P’,Q’ | [-(K)], here 
P,[=(Q)] - P and P,[=(Q)] - [7(Q)], both of which follow from substitutions 
in the proof of theorem 1.14. Thence the proof of theorem 1.82 gives a proof of 
(H) A [=(K)], which is -(V): 


F (P) = (P) theorem 1.14, 
F [=(Q)] => [F(Q)] theorem 1.14, 
F (P) = ([-(Q)] > {(P) A [-(Q)}) theorem 1.82, 
+ (P) > ([-(Q)] > {-[(P) > (Q)]}) definition of a, 


F (P) > {[-(Q)] => [->™)]} substitution, 
+ (P) = ([-(Q)] > {[-(W)] > [-(V)]}) axiom P1 and theorem 1.16, 
F (P) > {[-(Q)] > [(V) > (W)]}} axiom P3 and theorem 1.16. 


From the proofs of (P) = [(Q) => (S)] and (P) > {[-=(Q)] => (S)} follows a proof 
of (P) = (S), and thence from the proof of [=(P)] = (S) follows a proof of S. 


1.101 The propositional form $\{[(P) \Rightarrow (Q)] \Rightarrow (R)\} \Rightarrow \{[(R) \Rightarrow (P)] \Rightarrow (P)\}$, denoted
here by $S$, has the form $(V) \Rightarrow (W)$.


P True  If $P$ is True, then axiom P1 gives a proof of $(P) \Rightarrow (W)$, and hence a proof
of $(P) \Rightarrow (S)$.

P False, R False  If $P$ is False, then $(P) \Rightarrow (Q)$ holds, by the law of denial of the
antecedent (theorem 1.40):


$\vdash [\neg(P)] \Rightarrow [(P) \Rightarrow (Q)]$.


If $R$ is also False, then $[\neg(R)] \Rightarrow [\neg(R)]$, whence $[(P) \Rightarrow (Q)] \wedge [\neg(R)]$ holds,
whence also $\neg\{[(P) \Rightarrow (Q)] \Rightarrow (R)\}$, which is $\neg(V)$.

P False, R True  If $P$ is False and $R$ is True, then $(R) \Rightarrow (P)$ is False, but so is $P$,
whence $[(R) \Rightarrow (P)] \Rightarrow (P)$, which is $W$, is True.


1.103 The propositional form $(\{[(P) \Rightarrow (R)] \Rightarrow (Q)\} \Rightarrow (Q)) \Rightarrow \{[(Q) \Rightarrow (R)] \Rightarrow [(P) \Rightarrow (R)]\}$, denoted here by $S$, has the form $(V) \Rightarrow (W)$.


R True  If $R$ is True, then axiom P1 gives a proof of $W$, whence a proof of $S$.

R False, P False  If $P$ is False, then the proof of $[\neg(P)] \Rightarrow [(P) \Rightarrow (R)]$ gives a
proof of $W$, whence a proof of $S$.

R False, P True, Q True  If $Q$ is True and $R$ is False, then the definition of $\wedge$ gives
a proof of $(Q) \Rightarrow ([\neg(R)] \Rightarrow \{\neg[(Q) \Rightarrow (R)]\})$ and hence a proof of $W$, whence
a proof of $S$.

R False, P True, Q False  With $R$ False, $P$ True, $Q$ False, the definition of $\wedge$ gives
a proof of $(P) \Rightarrow ([\neg(R)] \Rightarrow \{\neg[(P) \Rightarrow (R)]\})$, and hence also a proof of
$[(P) \Rightarrow (R)] \Rightarrow (Q)$, whence a proof of $\neg(V)$, because $Q$ is False and $V$ is
$\{[(P) \Rightarrow (R)] \Rightarrow (Q)\} \Rightarrow (Q)$.


1.105 The propositional form U defined by 


[R) > Q] => [S) > ME > 1) > )] = [6) > MP 


has the form (V) => (W). 


P True If P is True, then axiom PI gives a proof of (P) => [(S) => (P)], whence 
a proof of (P) = (W), and hence a proof of (P) = (U). 

P False, S False If S is False, then the law of denial of the antecedent, [=(S)] > 
[(S) = (P)] gives a proof of [=(S)] = (W) and hence of [=(S)] = (U). 

P False, S True, R False With P False, S True, R False, the law of denial of the 
antecedent gives [—(R)] = [(R) => (Q)], while (S) = (P) is False, whence V is 
False, and then the law of denial of the antecedent gives (V) = (W), which is U. 

P False, S True, R True With P False, § True, R True, (R) => (P) is False, whence 
the law of denial of the antecedent gives [(R) => (P)] > [(S) > (P)], which is 
W;; hence the law of denial of the antecedent gives (V) = (W), which is U. 

P False, S True The foregoing two cases give a proof of [=(P)] = [(S) > (U)]. 

P False The proofs of [=(P)] > [(S) > (U)] and [-(P)] > {[-(S)] > (VY)} 
then combine into a proof of [—(P)] > (UV). 


Finally, the proofs of (P) = (U) and [=(P)] = (U) combine into a proof of U. 


Exercises from Chapter 2 


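2.1 $\exists X\{\forall Y[\neg(X \in Y)]\}$.

2.3 $\exists X\{(X \in A) \wedge [\neg(X \in B)]\}$.

2.5 $\exists X(\{(X \in C) \wedge [\neg(X \in A)] \wedge [\neg(X \in B)]\} \vee \{[\neg(X \in C)] \wedge [(X \in A) \vee (X \in B)]\})$.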
2.7 VX[(X € A) > (X € B)]. 

2.9 VX{(X €C) > [(X EA) A (X € B)]}. 

2.11 Theorem 2.46 establishes the equivalence $[\exists X(P)] \Leftrightarrow (\neg\{\forall X[\neg(P)]\})$.

2.13 Axiom Q2 is a theorem derivable from Margaris’s and Rosser’s axioms:

$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{[\forall X(P)] \Rightarrow [\forall X(Q)]\}$   axiom A4,
$\vdash (P) \Rightarrow [\forall X(P)]$   axiom A6, no free $X$ in $P$,
$\vdash \{\forall X[(P) \Rightarrow (Q)]\} \Rightarrow \{(P) \Rightarrow [\forall X(Q)]\}$   derived rule.


2.15 Axiom Q4 follows from the abbreviation -{VX[-(P)]}, double negation and 
theorem 2.45: 

F [VX(P)] @ {VX[-=-(P)]} theorem 2.45, 

F {3[VX(P)]} > (-{VX[--(P)]}) contraposition and its converse, 

F {4x[-(P)]} > [=(VX{-[-(P)]}) | abbreviation, (R) > (R), 

F {Ax[-(P)]} @ {-[VxX(P)]} transitivity. 


2.17 Kleene’s 3-rule is derivable from the rules of inference with axioms Q1— Q4 
and the propositional calculus. 


F (P) = (Q) hypothesis, 

F [(P) = (Q)] => {[-(Q) = [-@)]}  contraposition, 

F [>(Q) = [-(P)] Detachment, 

F VX{[=(Q) => [A(P)]} Generalization, 

F [=(Q)] > {VxX[-(P)]} theorem 2.29, 

F {VX[A(P)]} => {-|5x(P)]}} axiom Q3, 

F [-(Q)] => {-[2x(P)]} transitivity, 

F [AX(P)] > (Q) converse contraposition 


and Detachment. 


2.19 Kleene’s 3-schema is derivable from the rules of inference with axioms 
Q1- Q4 and the propositional calculus. 


+ {VX[-(P)]} > {Subfy[-(P)]}} axiom QI, 

+ {Subfy[-(P)]} > {-[Subf}(P)]} remark 2.20, 

+ {VX[=(P)]} = {-[Subf}(P)]} Detachment, 

+ [Subf}(P)] > (={VX [-(P)]}) contraposition and double negation, 


F (={[VX[-(P)]}}) > AX(P)] theorem 2.46, 
+ [Subf}(P)] > [AX(P)] transitivity. 

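2.21

$\vdash (V) \Leftrightarrow (U)$   hypothesis,
$\vdash [(V) \Leftrightarrow (U)] \Rightarrow [(V) \Rightarrow (U)]$   theorem 1.62,
$\vdash (V) \Rightarrow (U)$   Detachment,
$\vdash [(V) \Rightarrow (U)] \Rightarrow \{[(U) \Rightarrow (W)] \Rightarrow [(V) \Rightarrow (W)]\}$   transitivity (theorem 1.27),
$\vdash [(U) \Rightarrow (W)] \Rightarrow [(V) \Rightarrow (W)]$   Detachment.


2.23

$\vdash (V) \Leftrightarrow (U)$   hypothesis,
$\vdash [(V) \Leftrightarrow (U)] \Rightarrow [(U) \Rightarrow (V)]$   theorem 1.62,
$\vdash (U) \Rightarrow (V)$   Detachment,
$\vdash [(U) \Rightarrow (V)] \Rightarrow \{[(V) \Rightarrow (W)] \Rightarrow [(U) \Rightarrow (W)]\}$   transitivity (theorem 1.28),
$\vdash [(V) \Rightarrow (W)] \Rightarrow [(U) \Rightarrow (W)]$   Detachment.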


2.25 Theorem 2.45 shows that if (V) = (U), then (P) > (Q). 


2.27 If Fr (V) <= (U), then F [-(V)] & [A(U)], by contraposition and 
transposition. 


2.31 
{3X [(P) v (Q)]} <> {AX(P)] v [AX (Q} > {Ax(Q)] v [AX (P)]} 
<> ({3X [(Q) v (P)]}) 
2.33 
tAX[(P) v (P)} <> {AX(P)] v [AX(P)]} <> [AX(P)] 
2.35 


(VX {[(P) A (Q)] v (R)}) oe 
(WX {[(P) v (R)] A [(Q) v (R)]}) ° 
({VX[(P) v (R)]} A {VX[(Q) v (R)]}) 


2.37 


[Ax(Q)] + (AX{-[-(Q)]}) 

& (={VX[-(Q)]}) 

© [-(¥X {VX[-(Q)]})] 
© [=(¥X {- [AX(Q)]})] 
© [-(- {4x [2x(Q)]}) | 
> {4X[4x(Q)]} 


2.39 The implication


$\{\forall X[(P) \vee (Q)]\} \Rightarrow \{(P) \vee [\forall X(Q)]\}$

is theorem 2.79. For the converse,


$\vdash (P) \Rightarrow [\forall X(P)]$   axiom Q1,
$\vdash [\forall X(Q)] \Rightarrow [\forall X(Q)]$   theorem 1.14,
$\vdash \{(P) \vee [\forall X(Q)]\} \Rightarrow \{[\forall X(P)] \vee [\forall X(Q)]\}$   theorem 2.39,
$\vdash \{[\forall X(P)] \vee [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \vee (Q)]\}$   theorem 2.77,
$\vdash \{(P) \vee [\forall X(Q)]\} \Rightarrow \{\forall X[(P) \vee (Q)]\}$   theorem 1.16.


2.41 Invoke the reflexivity of the logical implication (theorem 1.63): by definition 
of Z, 


[2(A, B)] > (WX{[E(X, A)] = [(X, A)]}). 


which is in prenex form, and its matrix is an instance of theorem 1.63: (P) => (P). 


2.43 Invoke the transitivity of the logical implication (theorem 1.65): 


F [&(X,A)] => [@(X, B)] specialization of the hypothesis&(A, B), 
F [&(X, B)| > [E(X,C)] specialization of the hypothesis&(B, C), 
F [&(X,A)] => [E(X,C)] transitivity of the implication (theorem 1.65), 


whence the conclusion &(A, C) follows by Generalization. 
2.45 Invoke the reflexivity of the logical implication (theorem 1.63): by definition 
of &, 

[W(A,A)] > (WX{[E(A, Y)] => [&(A, Y)]}). 


which is in prenex form, and its matrix is an instance of theorem 1.63: (P) => (P). 


2.47 The equality predicate .% defined as in example 2.85 for set theory is 
reflexive, as proved by the solutions to exercises 2.41 and 2.45, so that / satisfies 
axioms _Y 1. Also, the set theory described in example 2.85 has only one predicate, 
&, and formula (2.2) shows that -¥ satisfies axiom 92. 


2.49 Axiom 1 from subsection 2.5.3 coincides with axiom .71 from subsec- 
tion 2.5.1. Axiom _%2 from subsection 2.5.3 is a special case of axiom .72 from 
subsection 2.5.1, which allows for atomic formulae as particular cases of P and Q. 


Exercises from Chapter 3 


3.1 Negate the definition of the empty set: $\exists X(X \in S)$.
3.3 Negate the definition of supersets: $\exists X[(X \in B) \wedge (X \notin A)]$.

3.5 By definition of the empty set (axiom S2), the formula $\neg(X \in \varnothing)$ is universally
valid. By specialization, with $\varnothing$ substituted for $X$, it follows that $\neg(\varnothing \in \varnothing)$ is True,
whence $\varnothing \in \varnothing$ is False, by definition of False.


3.7 Use substitutions in the axiom of extensionality and the definition of equality. 


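3.9 For each set $S$, $\varnothing \subseteq S$, by theorem 3.11. Consequently, $\varnothing$ is a subset of $\varnothing$.
Conversely, if $S$ is a subset of $\varnothing$, so that $S \subseteq \varnothing$, then $S = \varnothing$ by theorem 3.10:


$\vdash S \subseteq \varnothing$   hypothesis on $S$,
$\vdash \varnothing \subseteq S$   theorem 3.11,
$\vdash S = \varnothing$   theorem 3.10.


3.11 This proof proceeds by contraposition, showing that if $S \neq \varnothing$, then $S$ has a
subset different from $S$. Because $\varnothing \subseteq S$ by theorem 3.11, it follows that if $S \neq \varnothing$,
then $S$ has a subset, $\varnothing$, different from $S$.


3.13 If $S$ is a subset of every set, then $S$ is a subset of the empty set: $S \subseteq \varnothing$.
Moreover, $\varnothing \subseteq S$ by theorem 3.11. Consequently, $S = \varnothing$, by theorem 3.10.


3.15 If $A \subsetneq B$ and $B \subsetneq C$, then $A \subseteq B$ and $B \subseteq C$, whence $A \subseteq C$, by theorem 3.9.
However, because $A \subsetneq B$, there exists some $Z \in B$ such that $Z \notin A$. Consequently,
$Z \in C$ but $Z \notin A$, so that $A \neq C$, whence $A \subsetneq C$.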


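3.17 This proof establishes each implication ($\Rightarrow$ and $\Leftarrow$) independently.
For one implication, assume $\forall Y[(A \subseteq Y) \Leftrightarrow (B \subseteq Y)]$.


$\vdash \forall Y[(A \subseteq Y) \Leftrightarrow (B \subseteq Y)]$   hypothesis,
$\Downarrow$ $\mathrm{Subf}^Y_A$,
$\vdash (A \subseteq A) \Leftrightarrow (B \subseteq A)$
$\Downarrow$ $\Rightarrow$,
$\vdash (A \subseteq A) \Rightarrow (B \subseteq A)$
$\vdash A \subseteq A$   theorem 3.8,
$\vdash B \subseteq A$   Modus Ponens;


$\vdash \forall Y[(A \subseteq Y) \Leftrightarrow (B \subseteq Y)]$   hypothesis,
$\Downarrow$ $\mathrm{Subf}^Y_B$,
$\vdash (A \subseteq B) \Leftrightarrow (B \subseteq B)$
$\Downarrow$ $\Leftarrow$,
$\vdash (A \subseteq B) \Leftarrow (B \subseteq B)$
$\vdash B \subseteq B$   theorem 3.8,
$\vdash A \subseteq B$   Modus Ponens;
$\vdash A = B$   theorem 3.10.


For the converse, assume $\vdash A = B$, and begin with any superset $Y$.

$\vdash A \subseteq Y$   hypothesis,
$\vdash \forall X[(X \in A) \Rightarrow (X \in Y)]$   definition of subsets,
$\vdash B \subseteq A$   hypothesis,
$\vdash \forall X[(X \in B) \Rightarrow (X \in Y)]$   transitivity of $\Rightarrow$,
$\vdash B \subseteq Y$   definition of subsets.


Swapping $A$ and $B$ yields the converse. Hence $\forall Y[(A \subseteq Y) \Leftrightarrow (B \subseteq Y)]$.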


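3.19 If $C \supseteq D$ and $D \supseteq W$, then $D \subseteq C$ and $W \subseteq D$, whence $W \subseteq C$. For the
converse, let $W := D$, whence $D \supseteq D$, the hypothesis $(D \supseteq D) \Rightarrow (C \supseteq D)$, and
Modus Ponens yield $C \supseteq D$.


3.21 Let $X := \varnothing$, $Y := \{\varnothing\}$, $Z := \{\{\varnothing\}\}$.
3.23 Let $X := \varnothing$ and $Y := \{\{\varnothing\}\}$.
3.25 Let $X := \{\varnothing\}$ and $A := \{\{\varnothing\}\}$.


3.27 $(X \in \{\varnothing\}) \Leftrightarrow (X = \varnothing)$ whereas $(X \in \{\{\varnothing\}\}) \Leftrightarrow (X = \{\varnothing\})$. Yet $\varnothing \neq \{\varnothing\}$.
Consequently, $\{\varnothing\}$ and $\{\{\varnothing\}\}$ have different elements. Therefore $\{\varnothing\} \neq \{\{\varnothing\}\}$.
3.29 $\{\varnothing\} \in \{\varnothing, \{\varnothing\}\}$. Yet $(X \in \{\varnothing\}) \Leftrightarrow (X = \varnothing)$ and $\varnothing \neq \{\varnothing\}$, whence
$\{\varnothing\} \notin \{\varnothing\}$. Hence $\{\varnothing\} \neq \{\varnothing, \{\varnothing\}\}$.

3.31 From $\forall S(S \subseteq S)$ specialization with $S := \{\{\varnothing\}\}$ gives $\{\{\varnothing\}\} \subseteq \{\{\varnothing\}\}$.


3.33 The set $\{\{\varnothing\}\}$ has only one element $\{\varnothing\}$, which is also an element of the set
$\{\varnothing, \{\varnothing\}\}$. Hence $\{\{\varnothing\}\} \subseteq \{\varnothing, \{\varnothing\}\}$.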


3.35 For theorem 3.13, the word “and” in the informal proof corresponds to 
the logical connective Vv in the formal proof through the universal quantifier, 
specializations with H first and then K, and the axioms of extensionality and pairing, 
along the following outline: 


F(X =H) => [(X=H)v(X=K)] (P) => [(P) Vv @)], 
F(X =A)v (X=kK)]> (XeEL) pairing, 


Fk (X =HA)> (XeEL) transitivity, 

F(X =H)=> [X€L)> (HEL)  extensionality, 

F(X = H)=> (HEL) transitivity, 

+ (H=H)> (HEL) specialization Subf*, 
-FH=H extensionality, 

FHeEL Modus Ponens; 

-FKeEL as for H; 

L (HEL) A(K EL) (P) > {(Q) > [(P) AO). 


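3.37 If $A = B$, and if $S \subseteq A$, then each element of $S$ is also an element of $B$, whence
$S \subseteq B$. Thus $\mathcal{P}(A) \subseteq \mathcal{P}(B)$, and conversely with the rôles of $A$ and $B$ switched.
If $\mathcal{P}(A) = \mathcal{P}(B)$, and if $X \in A$, then $\{X\} \in \mathcal{P}(A)$, whence $\{X\} \in \mathcal{P}(B)$, so that
$\{X\} \subseteq B$, and hence $X \in B$. Thus $A \subseteq B$, and conversely $B \subseteq A$ with the rôles of $A$
and $B$ switched. Therefore $A = B$.

3.39 $\{\{\varnothing\}\} \cap \{\varnothing, \{\varnothing\}\} = \{\{\varnothing\}\}$.

3.41 $\{\{\varnothing\}\} \cup \{\varnothing, \{\varnothing\}\} = \{\varnothing, \{\varnothing\}\}$ in the superset $\{\varnothing, \{\varnothing\}\}$.

3.43 $\forall S[(S \cap \varnothing) = \varnothing]$.

3.45 By definition, $A \setminus \varnothing = \{X \in A : X \notin \varnothing\}$ where $X \notin \varnothing$ holds for every
set $X$. Hence $\forall A(\forall X\{(X \in A) \Leftrightarrow [(X \in A) \wedge (X \notin \varnothing)]\})$ whence $A = A \setminus \varnothing$ by
extensionality.

3.47 By definition, $A \setminus B = \{X \in A : X \notin B\}$. Thus $(A \setminus B) = \varnothing$ if and only if $X \notin B$
fails, and hence $X \in B$ holds, for every $X \in A$, which is the definition of $A \subseteq B$.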


3.49 The formula ∀A∀B[(A \ B) = (B \ A)] is False. Indeed, with A := ∅ and
B := {∅}, it follows that


A \ B = ∅ \ {∅}
      = ∅
      ≠ {∅}
      = {∅} \ ∅
      = B \ A.


3.51 {2, 3, 7} ∪ {3, 5, 7} = {2, 3, 5, 7}.

3.53 {2, 3, 7} ∩ {3, 5, 7} = {3, 7}.

3.55 {2, 3, 7} Δ {3, 5, 7} = {2, 5}.
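
The three computations above can be replayed with Python's built-in `set` type. This is only an illustrative sketch: it assumes that the third exercise asks for the symmetric difference, and the variable names are chosen here for the example.

```python
# Exercises 3.51-3.55 replayed with Python's built-in sets.
X = {2, 3, 7}
Y = {3, 5, 7}

union = X | Y                 # elements in X or in Y
intersection = X & Y          # elements in both X and Y
symmetric_difference = X ^ Y  # elements in exactly one of X and Y

print(union, intersection, symmetric_difference)
```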

3.57 ⋃∅ = ∅.

3.59 ⋃{∅, {∅}} = {∅} ≠ {∅, {∅}}.

3.61 {∅} ∈ {∅, {∅}} but {∅} ∉ {∅} = ⋃{∅, {∅}}.


3.63
X := {∅},
A := {{∅}};
Y := {{∅}},
B := {{{∅}}};
X ∪ Y = {∅, {∅}},
A ∪ B = {{∅}, {{∅}}}.

3.65


[X ∈ ⋃{A}] ⇔ {∃Y[(Y ∈ {A}) ∧ (X ∈ Y)]}
⇔ {∃Y[(Y = A) ∧ (X ∈ Y)]}
⇔ [∃Y(X ∈ A)]
⇔ X ∈ A.


3.67 Use the tautology {[(P) ∨ (Q)] ∧ (R)} ⇔ {[(P) ∧ (R)] ∨ [(Q) ∧ (R)]}.
3.69 Use the tautology [(P) ∨ (False)] ⇔ (P):


[X ∈ (A ∪ ∅)] ⇔ [(X ∈ A) ∨ (X ∈ ∅)]
⇔ X ∈ A.


3.71 Use the tautology [(P) ∨ (P)] ⇔ (P).
3.73 Use de Morgan’s second law:


{X ∈ [U \ (A ∩ B)]} ⇔ [(X ∈ U) ∧ {¬[X ∈ (A ∩ B)]}]
⇔ [(X ∈ U) ∧ {¬[(X ∈ A) ∧ (X ∈ B)]}]
⇔ {(X ∈ U) ∧ [(X ∉ A) ∨ (X ∉ B)]}
⇔ {[(X ∈ U) ∧ (X ∉ A)] ∨ [(X ∈ U) ∧ (X ∉ B)]}
⇔ {[X ∈ (U \ A)] ∨ [X ∈ (U \ B)]}
⇔ {X ∈ [(U \ A) ∪ (U \ B)]}.
3.75 Use the tautology (B) ∨ [¬(B)] and the contradiction (B) ∧ [¬(B)], with X ∈ U
for B:
{X ∈ [(A \ B) \ U]}
⇔ {[X ∈ (A \ B)] ∧ [¬(X ∈ U)]}
⇔ ({(X ∈ A) ∧ [¬(X ∈ B)]} ∧ [¬(X ∈ U)])
⇔ ({(X ∈ A) ∧ [¬(X ∈ U)]} ∧ [¬(X ∈ B)])
⇔ ([X ∈ (A \ U)] ∧ ([¬(X ∈ B)] ∨ {(X ∈ U) ∧ [¬(X ∈ U)]}))
⇔ [{(X ∈ A) ∧ [¬(X ∈ U)]} ∧ ({[¬(X ∈ B)] ∨ (X ∈ U)} ∧ {[¬(X ∈ B)] ∨ [¬(X ∈ U)]})]
⇔ [({(X ∈ A) ∧ [¬(X ∈ U)]} ∧ {[¬(X ∈ B)] ∨ [¬(X ∈ U)]}) ∧ {[¬(X ∈ B)] ∨ (X ∈ U)}]
⇔ [{(X ∈ A) ∧ [¬(X ∈ U)]} ∧ {[¬(X ∈ B)] ∨ (X ∈ U)}]
⇔ [{(X ∈ A) ∧ [¬(X ∈ U)]} ∧ (¬{(X ∈ B) ∧ [¬(X ∈ U)]})]
⇔ ([X ∈ (A \ U)] ∧ {¬[X ∈ (B \ U)]})
⇔ {X ∈ [(A \ U) \ (B \ U)]}.

3.77


{X ∈ [(A ∪ B) \ U]} ⇔ {[X ∈ (A ∪ B)] ∧ [¬(X ∈ U)]}
⇔ {[(X ∈ A) ∨ (X ∈ B)] ∧ [¬(X ∈ U)]}
⇔ ({(X ∈ A) ∧ [¬(X ∈ U)]} ∨ {(X ∈ B) ∧ [¬(X ∈ U)]})
⇔ {X ∈ [(A \ U) ∪ (B \ U)]}.


3.79


{X ∈ [U ∩ (A \ B)]}
⇔ {(X ∈ U) ∧ [X ∈ (A \ B)]}
⇔ [(X ∈ U) ∧ {(X ∈ A) ∧ [¬(X ∈ B)]}]
⇔ [{(X ∈ U) ∧ [¬(X ∈ B)]} ∧ (X ∈ A)]
⇔ ([{(X ∈ U) ∧ [¬(X ∈ B)]} ∧ (X ∈ A)] ∨ [{(X ∈ U) ∧ [¬(X ∈ B)]} ∧ [¬(X ∈ U)]])
⇔ [{(X ∈ U) ∧ [¬(X ∈ B)]} ∧ {(X ∈ A) ∨ [¬(X ∈ U)]}]
⇔ [{(X ∈ U) ∧ [¬(X ∈ B)]} ∧ (¬{[¬(X ∈ A)] ∧ (X ∈ U)})]
⇔ {X ∈ [(U \ B) \ (U \ A)]}.


3.81 If C ⊆ (A ∩ B), then C ⊆ A and C ⊆ B.


3.83

A := {∅},
B := {{∅}},
A ∪ B = {∅, {∅}},
C := A ∪ B,
C ∈ 𝒫(A ∪ B),
C ∉ 𝒫(A),
C ∉ 𝒫(B),
C ∉ 𝒫(A) ∪ 𝒫(B).


3.85 ∅ ∈ 𝒫(A \ B) but ∅ ∈ 𝒫(B) whence ∅ ∉ [𝒫(A) \ 𝒫(B)].

3.87


(⋃ℱ) ∩ B = ⋃_{A∈ℱ} (A ∩ B)
⇕
∀X{[X ∈ (⋃ℱ) ∩ B] ⇔ [X ∈ ⋃_{A∈ℱ} (A ∩ B)]}
⇕
∀X({[X ∈ (⋃ℱ)] ∧ (X ∈ B)} ⇔ [∃A{(A ∈ ℱ) ∧ [X ∈ (A ∩ B)]}])
⇕
∀X({[∃A{(A ∈ ℱ) ∧ (X ∈ A)}] ∧ (X ∈ B)}
⇔ [∃A{(A ∈ ℱ) ∧ [(X ∈ A) ∧ (X ∈ B)]}])
which is a tautology by associativity of ∧.
3.89 AΔA = (A ∪ A) \ (A ∩ A) = A \ A = ∅.
3.91 AΔB = (A ∪ B) \ (A ∩ B) = (B ∪ A) \ (B ∩ A) = BΔA.
3.93 Yes, [(AΔC) ∪ (BΔC)] ⊇ [(A ∪ B)ΔC].


(AU B)AC = [(AUB) UC]\ [(AUB) NC 
= [((AUB)UC]\ [(ANQU(BNO) 
= [((AUBUO)\ (ANC N[AUBUC)\ (BNO) 
= {(AUBUC) \ A]U[(AUBUQ)\ CH} N{[(AUBUO) \ B] 
U[(AUBUC)\ C} 
SF {(AUC)\AJU[AUC)\C}NT[BUC)\ B]ULBUC)\ CH} 
€ {[(AUC)\AJU[AUC)\ CHU (BUC) \ B]U[BUC)\ CH} 
= [AUC)\ANO]U[BUC)\(BNO)] 
= [(AAC) U (BAO)]. 


3.95 Yes, [(AAC) N (BAC)] > [(AN B)ACI. 


(AN B)AC = [(ANB)UC]\ [(ANB)NC 
= {(ANB)UC]\ ANB} U{ANB)UC]\C} 
= {AN B)UC]\ASU {[(ANB) UC] \ BYU ANB) UC] \ C} 
S {(AUC)\AJULAUC)\C}N{UBUC)\ B}ULBUC) \ C} 
= [AUC)\ANO]N[(BUC)\ (BNO) 
= [(AAC) N (BAO)]. 


3.97 Yes, [(AAC) \ (BAC)] 2 [(A \ B) AC]. 
3.99 No, [(A ∪ C)Δ(B ∪ C)] ⊉ [(AΔB) ∪ C], because the left-hand side does not
contain (A ∪ C) ∩ (B ∪ C) while the right-hand side contains all of C.


3.101 Yes, [(A ∩ C)Δ(B ∩ C)] ⊇ [(AΔB) ∩ C].

3.103 Yes, [(A \ C)Δ(B \ C)] ⊇ [(AΔB) \ C].

3.105 No, [(C \ A)Δ(C \ B)] ⊉ [C \ (AΔB)], because the left-hand side does not
contain C \ (A ∪ B) while the right-hand side contains all of C \ (A ∪ B).


3.107 No, [𝒫(A)Δ𝒫(B)] ⊉ [𝒫(AΔB)], because the left-hand side does not
contain ∅ ∈ 𝒫(A) ∩ 𝒫(B) while the right-hand side contains ∅ ∈ 𝒫(AΔB).


3.109


(AΔB) ∩ (A ∩ B) = [(A ∪ B) \ (A ∩ B)] ∩ (A ∩ B)
= ∅;

(AΔB) ∪ (A ∩ B) = [(A ∪ B) \ (A ∩ B)] ∪ (A ∩ B)
= A ∪ B.


3.111 As defined here the Cartesian product of sets is not associative (but a slightly
different version of the Cartesian product is associative). For example,
(A × B) × C ≠ A × (B × C) for the sets A = {0}, B = {1}, and C = {2}, because


(A × B) × C = {((0, 1), 2)} = {{{(0, 1)}, {(0, 1), 2}}},
A × (B × C) = {(0, (1, 2))} = {{{0}, {0, (1, 2)}}},


which reveals that {0} ∈ (0, (1, 2)) ∈ A × (B × C) whereas {0} ∉ ((0, 1), 2) ∈
(A × B) × C. Consequently, (0, (1, 2)) ≠ ((0, 1), 2) by extensionality, and then
(A × B) × C ≠ A × (B × C) again by extensionality. (A slightly different definition of
the Cartesian product through functions of integers will later provide an associative
Cartesian product.)
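
The comparison of the two nested pairs can also be checked mechanically. The following is a rough sketch, assuming the ordered pair (X, Y) := {{X}, {X, Y}} and, for brevity, treating the Python integers 0, 1, 2 as atoms rather than as sets; the helper `pair` is a name introduced only for this illustration.

```python
def pair(x, y):
    """Ordered pair (X, Y) := {{X}, {X, Y}}, built from hashable frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

left = pair(pair(0, 1), 2)    # ((0, 1), 2), the only element of (A x B) x C
right = pair(0, pair(1, 2))   # (0, (1, 2)), the only element of A x (B x C)

singleton_zero = frozenset({0})
in_right = singleton_zero in right   # {0} is an element of (0, (1, 2))
in_left = singleton_zero in left     # {0} is not an element of ((0, 1), 2)
print(in_right, in_left, left == right)
```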


3.113 (The last step still requires further symbolic substeps.)
(A = ∅) ∨ (B = ∅)
⇕ extensionality,
{¬[∃X(X ∈ A)]} ∨ {¬[∃Y(Y ∈ B)]}
⇕ de Morgan’s Law,
¬{[∃X(X ∈ A)] ∧ [∃Y(Y ∈ B)]}
⇕ definition of A × B,
¬{∃X∃Y[(X, Y) ∈ (A × B)]}
⇕ uniqueness of ∅.
(A × B) = ∅


3.115 The Cartesian product distributes over unions: for all sets A, B, and C,


[(A × B) ∪ (C × B)] = [(A ∪ C) × B].

An informal proof can establish that [(A × B) ∪ (C × B)] and [(A ∪ C) × B] have
exactly the same elements. These two sets consist of elements of Cartesian products,
and, consequently, their elements are ordered pairs.


• An ordered pair (X, Y) is an element of (A × B) ∪ (C × B) if and only if (X, Y) ∈
(A × B) or (X, Y) ∈ (C × B);

• hence (X, Y) ∈ [(A × B) ∪ (C × B)] if and only if X ∈ A and Y ∈ B, or X ∈ C and
Y ∈ B,

• which is equivalent to X ∈ A or X ∈ C, and Y ∈ B;

• thus (X, Y) ∈ [(A × B) ∪ (C × B)] if and only if X ∈ (A ∪ C) and Y ∈ B, which
is equivalent to (X, Y) ∈ [(A ∪ C) × B].


Just as the preceding informal proof concatenated the two occurrences of Y ∈ B into
one such occurrence, a formal proof can rely on the distributivity of ∧ over ∨ by the
tautology


{[(P) ∧ (R)] ∨ [(Q) ∧ (R)]} ⇔ {[(P) ∨ (Q)] ∧ (R)}.


(X, Y) ∈ [(A × B) ∪ (C × B)]
⇕ definition of union,
[(X, Y) ∈ (A × B)] ∨ [(X, Y) ∈ (C × B)]
⇕ definition of Cartesian products,
[(X ∈ A) ∧ (Y ∈ B)] ∨ [(X ∈ C) ∧ (Y ∈ B)]
⇕ distributivity of ∧ over ∨,
[(X ∈ A) ∨ (X ∈ C)] ∧ (Y ∈ B)
⇕ definition of union,
[X ∈ (A ∪ C)] ∧ (Y ∈ B)
⇕ definition of Cartesian products.
(X, Y) ∈ [(A ∪ C) × B]
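
A finite spot check cannot replace the proof, but it may help to see the identity on concrete sets. The sketch below is only an illustration: the sample sets and the helper `cartesian` are assumptions made for this example, and Python tuples stand in for ordered pairs.

```python
from itertools import product

def cartesian(left, right):
    """Cartesian product as a set of ordered pairs (Python tuples)."""
    return set(product(left, right))

# Small sample sets, chosen arbitrarily for the illustration.
A, B, C = {0, 1}, {"a", "b"}, {1, 2}

lhs = cartesian(A, B) | cartesian(C, B)   # (A x B) U (C x B)
rhs = cartesian(A | C, B)                 # (A U C) x B
distributes = lhs == rhs
print(distributes)
```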
3.117
[A × (B \ D)] = [(A × B) \ (A × D)]
⇕
∀X∀Y{(X, Y) ∈ [A × (B \ D)] ⇔ (X, Y) ∈ [(A × B) \ (A × D)]}
⇕
∀X∀Y{[(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}] ⇔ [(X ∈ A) ∧ (Y ∈ B) ∧ {¬[(X ∈ A) ∧ (Y ∈ D)]}]}
⇕
∀X∀Y{[(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}] ⇔ [(X ∈ A) ∧ (Y ∈ B) ∧ {[¬(X ∈ A)] ∨ [¬(Y ∈ D)]}]}
⇕
∀X∀Y{[(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}] ⇔ ([(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}] ∨ [(X ∈ A) ∧ (Y ∈ B) ∧ {¬(X ∈ A)}])}
⇕ (P) ⇔ [(P) ∨ (False)]
∀X∀Y{[(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}] ⇔ [(X ∈ A) ∧ (Y ∈ B) ∧ {¬(Y ∈ D)}]}

3.119 No, [(A \ C) × (B \ D)] ⊉ [(A × B) \ (C × D)], because the right-hand side
contains all of (A \ C) × B.


3.121 No, [(AΔC) × (BΔD)] ⊉ [(A × B)Δ(C × D)]. For instance, if B = D, then
BΔD = ∅, whence (AΔC) × (BΔD) = ∅ on the left-hand side. Yet on the right-hand
side, still with B = D, if C = ∅, then C × D = ∅, whence (A × B)Δ(C × D) =
(A × B).


3.123 No, ([𝒫(A)] × [𝒫(B)]) ⊉ 𝒫(A × B), because the left-hand side consists of
pairs of subsets of A and B, whereas the right-hand side consists of subsets of A × B.
For instance, if A = ∅ = B, then A × B = ∅, whence 𝒫(A × B) = 𝒫(∅) = {∅},
whereas [𝒫(A)] × [𝒫(B)] = [𝒫(∅)] × [𝒫(∅)] = {∅} × {∅} = {(∅, ∅)}.


3.125 If S denotes the relation of strict inclusion on 𝒜 := 𝒫(A), then
S⁻¹ = {(V, W) ∈ 𝒫(A) × 𝒫(A) : W ⊊ V}.
3.127 From @ C A and @ C B it follows that @ x @ € P(A) x A(B), and also 


@=OxOGCOx OS. Thus We |. 


3.129 IfA = @ = B, then P(A) = A(G) = {@} = J, because P(A) x A(B) = 
P(G) x P(W) = {D} x {G} = {(W, B)}. From the solution to exercise 3.127, it 
follows that Y(A) = {@} is all of F. 


3.131 No, F is not a function, because it contains two pairs, (0, 1) and (0, 4), with 
the same first coordinate but different second coordinates. 


3.133 Yes, R is a function. 
3.135 Yes, Z is a function (in effect the zero function). 
3.137 Yes, the empty set ∅ ⊆ ∅ × B is a function from ∅ to B.


3.139 For each X ∈ B, 1_A|_B(X) = 1 = χ_B(X), so that 1_A|_B coincides with χ_B
on B (but not on A).


3.141 If V ≠ W, then χ_V ∪ χ_W is not a function, because of multiple values on
VΔW, where one of χ_V or χ_W has the value 0 while the other has the value 1.
Therefore, χ_V ∪ χ_W does not coincide with the function χ_{V∪W}.


3.143 Let


R := {∅},
S := {{∅}},
A := R ∪ S,
B := {∅},
F : A → B,
F := {(∅, ∅), ({∅}, ∅)}.

Then


R ∩ S = ∅,
F″(R ∩ S) = F″(∅) = ∅,
F″(R) = F″({∅}) = {F(∅)} = {∅} = B,
F″(S) = F″({{∅}}) = {F({∅})} = {∅} = B,
F″(R) ∩ F″(S) = B ∩ B = B ≠ F″(R ∩ S).
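
The same counterexample can be replayed with frozensets, which are hashable and can therefore serve as elements and dictionary keys. This is a small sketch under that encoding; `image` is a helper name introduced only here.

```python
def image(function, subset):
    """Image of `subset` under a function stored as a dict."""
    return {function[x] for x in subset}

EMPTY = frozenset()          # the empty set
ONE = frozenset({EMPTY})     # the set whose only element is the empty set

R = {EMPTY}
S = {ONE}
F = {EMPTY: EMPTY, ONE: EMPTY}   # F maps both elements of A to the empty set

image_of_intersection = image(F, R & S)
intersection_of_images = image(F, R) & image(F, S)
print(image_of_intersection, intersection_of_images)
```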


3.145 For all sets and for every function, [F″(K)] \ [F″(H)] ⊆ F″(K \ H).


3.147 For each X ∈ A, from {X} ⊆ A it follows that F″({X}) ⊆ B, with F″({X}) =
{F(X)} by definition of F″. Consequently, for each X ∈ A, the pair ({X}, F″({X})) ∈
𝒫(A) × 𝒫(B) corresponds to the pair (X, F(X)) ∈ F ⊆ A × B.
3.149 F″({∅, {∅}}) = {F(∅), F({∅})} = {F(0), F(1)} = {2}.

F⁻¹″({∅, {∅}}) = F⁻¹″({0, 1}) = {2}.

F({∅, {∅}}) = F(2) = 0 = ∅.

F⁻¹″({{∅, {∅}}}) = F⁻¹″({2}) = {0, 1}.
3.151 For each X ∈ A, (I_B ∘ F)(X) = I_B[F(X)] = F(X).
3.153 ∅ ⊆ F ∘ ∅ ⊆ ∅ × B ⊆ ∅.


3.155 Define a function G : F″(A) → A as follows. For each Y ∈ F″(A) there exists
exactly one X ∈ 𝒟(F) ⊆ A with Y = F(X); define G(Y) := X. Then G[F(X)] = X,
so that G ∘ F = I_{𝒟(F)}, with 𝒟(G) = F″(A) ⊆ B.


3.157 Let
A := {∅, {∅}},
B := {∅},
F := {(∅, ∅), ({∅}, ∅)},
G := {(∅, ∅)}.


Then (F ∘ G)(∅) = ∅, so that F ∘ G = I_B, and hence G is a right inverse for F. Yet
F has no left inverse, because F is not injective.


3.159 Let
A := {0, 1},
B := {0, 1, 2},
F := {(0, 0), (1, 1)},
G := {(0, 0), (1, 1), (2, 0)},
H := {(0, 0), (1, 1), (2, 1)}.
Then G ∘ F = I_A and H ∘ F = I_A.
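
As an informal check, the two compositions can be computed with dictionaries standing in for functions. The sketch assumes that F consists of the pairs (0, 0) and (1, 1); `compose` is a helper written only for this illustration.

```python
def compose(outer, inner):
    """Composition of two functions stored as dicts: first inner, then outer."""
    return {x: outer[y] for x, y in inner.items()}

F = {0: 0, 1: 1}            # F : A -> B
G = {0: 0, 1: 1, 2: 0}      # G : B -> A
H = {0: 0, 1: 1, 2: 1}      # H : B -> A
identity_on_A = {0: 0, 1: 1}

g_after_f = compose(G, F)
h_after_f = compose(H, F)
print(g_after_f == identity_on_A, h_after_f == identity_on_A, G != H)
```

Both compositions return the identity on A although G and H differ at 2, so a left inverse need not be unique.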


3.161 The empty relation is vacuously reflexive, symmetric, and transitive because
(X, Y) ∈ ∅ is False.
Reflexivity:


∀X{(X ∈ ∅) ⇒ [(X, X) ∈ ∅]}


is True (a theorem) by the tautology (theorem) (False) ⇒ (P).
Symmetry:


∀X∀Y{[(X, Y) ∈ ∅] ⇒ [(Y, X) ∈ ∅]}


is True (a theorem) by the tautology (theorem) (False) ⇒ (P).
Transitivity:


∀X∀Y∀Z({[(X, Y) ∈ ∅] ∧ [(Y, Z) ∈ ∅]} ⇒ [(X, Z) ∈ ∅])


is True (a theorem) by the tautology (theorem) (False) ⇒ (P).


3.163 The relation of strict inclusion is vacuously anti-symmetric, because the
hypothesis (V ⊊ W) ∧ (W ⊊ V) is False for all sets V and W:
(V ⊊ W) ∧ (W ⊊ V)
⇕ definition of ⊊,
[(V ⊆ W) ∧ (V ≠ W)] ∧ [(W ⊆ V) ∧ (W ≠ V)]
⇕ definitions of ⊆ and =,
{∀X[(X ∈ V) ⇒ (X ∈ W)]} ∧ {∃Y[(Y ∈ W) ∧ (Y ∉ V)]}
∧ {∀X[(X ∈ W) ⇒ (X ∈ V)]} ∧ {∃Z[(Z ∈ V) ∧ (Z ∉ W)]}
⇕ [∃Y(P)] ⇔ {¬∀Y[¬(P)]},
{∀X[(X ∈ V) ⇒ (X ∈ W)]} ∧ {¬∀Y¬[(Y ∈ W) ∧ (Y ∉ V)]}
∧ {∀X[(X ∈ W) ⇒ (X ∈ V)]} ∧ {¬∀Z¬[(Z ∈ V) ∧ (Z ∉ W)]}
⇕ de Morgan’s Law,
{∀X[(X ∈ V) ⇒ (X ∈ W)]} ∧ {¬∀Y([¬(Y ∈ W)] ∨ (Y ∈ V))}
∧ {∀X[(X ∈ W) ⇒ (X ∈ V)]} ∧ {¬∀Z([¬(Z ∈ V)] ∨ (Z ∈ W))}
⇕ [(P) ⇒ (Q)] ⇔ {[¬(P)] ∨ (Q)},
{∀X[(X ∈ V) ⇒ (X ∈ W)]} ∧ {¬∀Y[(Y ∈ W) ⇒ (Y ∈ V)]}
∧ {∀X[(X ∈ W) ⇒ (X ∈ V)]} ∧ {¬∀Z[(Z ∈ V) ⇒ (Z ∈ W)]}
⇕ substitutions,
{∀X[(X ∈ V) ⇒ (X ∈ W)]} ∧ {¬∀Y[(Y ∈ W) ⇒ (Y ∈ V)]}
∧ {∀Y[(Y ∈ W) ⇒ (Y ∈ V)]} ∧ {¬∀X[(X ∈ V) ⇒ (X ∈ W)]}
⇕ (P) ∧ [¬(P)] is False,
False
Consequently, with (V ⊊ W) ∧ (W ⊊ V) False, the implication


[(V ⊊ W) ∧ (W ⊊ V)] ⇒ (V = W)


is True.


3.165 Here are the equivalence classes:


A/ℛ = {[0], [1]},
[0] = {0, 2, 4},
[1] = {1, 3, 5}.


3.167 Verify that ⋃ℱ = B and that the elements of ℱ are pairwise disjoint:


⋃ℱ = {0, 2, 4, 6} ∪ {1, 3, 5, 7} = {0, 1, 2, 3, 4, 5, 6, 7} = B,
{0, 2, 4, 6} ∩ {1, 3, 5, 7} = ∅.


Here is the corresponding equivalence relation:

(1, 7) (3, 7) (5, 7) (7, 7)
(0, 6) (2, 6) (4, 6) (6, 6)
(1, 5) (3, 5) (5, 5) (7, 5)
(0, 4) (2, 4) (4, 4) (6, 4)
(1, 3) (3, 3) (5, 3) (7, 3)
(0, 2) (2, 2) (4, 2) (6, 2)
(1, 1) (3, 1) (5, 1) (7, 1)
(0, 0) (2, 0) (4, 0) (6, 0)
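
The passage from a partition to its equivalence relation can be sketched in a few lines. The function name `relation_from_partition` is an assumption made for this illustration; the construction simply collects every pair whose two coordinates lie in the same block.

```python
def relation_from_partition(partition):
    """All pairs (x, y) such that x and y lie in the same block of the partition."""
    return {(x, y) for block in partition for x in block for y in block}

partition = [{0, 2, 4, 6}, {1, 3, 5, 7}]
relation = relation_from_partition(partition)
universe = set().union(*partition)

reflexive = all((x, x) in relation for x in universe)
symmetric = all((y, x) in relation for (x, y) in relation)
transitive = all((x, w) in relation
                 for (x, y) in relation for (z, w) in relation if y == z)
print(len(relation), reflexive, symmetric, transitive)
```

The relation has 16 pairs from each block, which matches the 32 pairs displayed in the solution.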


3.169 Outline: For each equivalence relation ℛ, the partition ℱ_ℛ consists of all the
equivalence classes corresponding to ℛ. The relation defined by the partition ℱ_ℛ
then consists of all the pairs from all the equivalence classes of ℛ, which is then
ℛ again.


3.171 The relation is

antisymmetric, vacuously, because it does not contain any two pairs (X, Y) and
(Y, X),

asymmetric, because it does not contain any two pairs (X, Y) and (Y, X),

not connected, because it contains neither (2, 9) nor (9, 2),

irreflexive, because it does not contain any element on the diagonal,

not reflexive, because it does not contain (0, 0),

not strongly connected, because it contains neither (2, 9) nor (9, 2),

not symmetric, because it contains (0, 1) but not (1, 0),

not transitive, because it contains (2, 6) and (6, 9) but not (2, 9).


3.173 The relation is

not antisymmetric, because it contains (0, 1) and (1, 0) but 0 ≠ 1,

not asymmetric, because it contains both (0, 1) and (1, 0),

not connected, because it contains neither (8, 9) nor (9, 8),

irreflexive, because it does not contain any element on the diagonal,

not reflexive, because it does not contain (0, 0),

not strongly connected, because it contains neither (8, 9) nor (9, 8),

symmetric, because if it contains (X, Y) then it also contains (Y, X),

not transitive, because it contains (9, 6) and (6, 8) but not (9, 8).


3.175 The relation is

not antisymmetric, because it contains (0, 1) and (1, 0) but 0 ≠ 1,

not asymmetric, because it contains both (0, 1) and (1, 0),

not connected, because it contains neither (0, 2) nor (2, 0),

not irreflexive, because it contains at least one element on the diagonal,

not reflexive, because it does not contain (0, 0),

not strongly connected, because it contains neither (0, 2) nor (2, 0),

not symmetric, because it contains (1, 2) but not (2, 1),

not transitive, because it contains (0, 1) and (1, 2) but not (0, 2).

3.177 The empty relation is vacuously a partial order.


3.179 See the definitions of the “diagonal” (example 3.68) and “irreflexive” 
(definition 3.165). 


3.181 By definition, every strict partial order is irreflexive and transitive. If it 
contained (X, Y) and (Y,X), then it would also contain (X, X) by transitivity, but 
it cannot contain (X, X) by irreflexivity; consequently, it cannot contain both (X, Y) 
and (Y, X), which makes it asymmetric. 


3.183 By definition 3.176, an asymmetric relation cannot contain both (X, Y)
and (Y, X), so that the hypothesis in the definition (3.173) of “anti-symmetric” is
always False, and the definition holds vacuously.


3.185 By the preceding exercise (3.184), every asymmetric relation is also 
irreflexive; consequently, an asymmetric and transitive relation is also irreflexive and 
transitive, which is the definition of a strict partial order. Conversely, the solution 
to exercise 3.181 shows that every strict partial order is asymmetric, whence also 
asymmetric and transitive. 


3.187 The empty relation is vacuously asymmetric and strongly connected. 


3.189 The empty subset of the empty set is vacuously a chain with respect to the 
relation of inclusion. 


Exercises from Chapter 4 


4.1


5 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}}.
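
The nested braces become easier to trust when the construction is automated. The following sketch builds natural numbers as sets by repeating the successor operation M ∪ {M}, starting from the empty set; the function names are chosen only for this example, and frozensets are used because ordinary Python sets cannot be elements of sets.

```python
def successor(n):
    """Successor N U {N} of a natural number represented as a frozenset."""
    return n | frozenset({n})

def natural(k):
    """The set representing the natural number k, starting from the empty set."""
    n = frozenset()
    for _ in range(k):
        n = successor(n)
    return n

five = natural(5)
elements_are_predecessors = five == frozenset(natural(i) for i in range(5))
two_is_element_of_five = natural(2) in five
print(len(five), elements_are_predecessors, two_is_element_of_five)
```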


4.3 For each set, X ∈ {X} by pairing, whence X ∈ (X ∪ {X}) by union.
Moreover, for each set, X ⊆ X, whence X ⊆ (X ∪ {X}), also by union.


4.5 The equation may fail. Let A := ∅ and B := {{∅}}. Then A ⊆ B and


A ∪ {A} = ∅ ∪ {∅} = {∅},
B ∪ {B} = {{∅}} ∪ {{{∅}}} = {{∅}, {{∅}}},

but ∅ ∉ {{∅}, {{∅}}} whence


{∅} ⊈ {{∅}, {{∅}}},


so that A ∪ {A} ⊈ B ∪ {B}.
4.7 The equation may fail. Let A := ∅ and B := {{∅}}. Then


(A ∪ {A}) ∩ (B ∪ {B}) = (∅ ∪ {∅}) ∩ ({{∅}} ∪ {{{∅}}}) = ∅,
(A ∩ B) ∪ {A ∩ B} = (∅ ∩ {{∅}}) ∪ {∅ ∩ {{∅}}} = {∅},


so that (A ∪ {A}) ∩ (B ∪ {B}) ≠ (A ∩ B) ∪ {A ∩ B}.
4.9 This proof proceeds by induction with M.


Initial Step

If M = 0, then M = ∅, whence the hypothesis “K ∈ M and L ∈ M” is False for all
K, L ∈ ℕ, and hence the implication is True.

Induction hypothesis

Assume that there exists N ∈ ℕ such that the theorem holds for M := N, so that for
all K, L ∈ ℕ, if K ∈ N and L ∈ N, then K ∪ L ∈ N.

Induction step


For all K, L ∈ ℕ, if K ∈ (N ∪ {N}) and L ∈ (N ∪ {N}), then only four cases can
arise.

If K ∈ N and L ∈ N, then (K ∪ L) ∈ N by induction hypothesis; from N ⊆ N ∪ {N}
it then follows that (K ∪ L) ∈ (N ∪ {N}).

If K ∈ {N} and L ∈ N, then K = N and L ⊆ N, whence K ∪ L = N ∪ L = N,
and hence (K ∪ L) = N ∈ {N} ⊆ (N ∪ {N}); swapping K and L yields a proof for
K ∈ N and L ∈ {N}.

If K ∈ {N} and L ∈ {N}, then K = N and L = N, whence K ∪ L = N, and hence
(K ∪ L) = N ∈ {N} ⊆ (N ∪ {N}).


4.11 Let H := I be as defined by the Axiom of Infinity.
H ∈ ℱ      yet unproved,
⇕ definition of ℱ,
[H ∈ 𝒫(H)]
∧ [(∅ ∈ H) ∧ (∀X{(X ∈ H) ⇒ [(X ∪ {X}) ∈ H]})]


which is True by the definition of H and the Axiom of Infinity.


4.13
∀X{(X ∈ ℕ) ⇒ [(X ∪ {X}) ∈ ℕ]}      yet unproved,
⇕ definition of ℕ,
∀X{(X ∈ ⋂ℱ) ⇒ [(X ∪ {X}) ∈ ⋂ℱ]}
⇕ definition of ⋂,
∀X[{∀B[(B ∈ ℱ) ⇒ (X ∈ B)]}
⇒ (∀B{(B ∈ ℱ) ⇒ [(X ∪ {X}) ∈ B]})]
⇑
∀X[∀B({(B ∈ ℱ) ⇒ (X ∈ B)}
⇒ {(B ∈ ℱ) ⇒ [(X ∪ {X}) ∈ B]})]
⇑ axiom P2,
∀X{∀B[(B ∈ ℱ)
⇒ {(X ∈ B) ⇒ [(X ∪ {X}) ∈ B]}]}


which is True by the definition of ℱ. Moreover, ∅ ∈ ℕ by theorem 4.8.
4.15
⊢ I ∈ ℱ                                                     theorem 4.4,
⊢ (I ∈ ℱ) ⇒ (⋂ℱ ⊆ I)                                        theorem 3.48,
⊢ ⋂ℱ ⊆ I                                                    Modus Ponens;
⊢ (S ⊆ ℕ) ⇒ (S ⊆ ⋂ℱ)                                        definition of ℱ,
⊢ S ⊆ I                                                     transitivity of ⊆;
⊢ (S ⊆ ℕ) ∧ [(∅ ∈ S) ∧ (∀X{(X ∈ S) ⇒ [(X ∪ {X}) ∈ S]})]     hypothesis on S,
⊢ (∅ ∈ S) ∧ (∀X{(X ∈ S) ⇒ [(X ∪ {X}) ∈ S]})                 [(P) ∧ (Q)] ⇒ (Q) and M.P.,
⊢ [(∅ ∈ S) ∧ (∀X{(X ∈ S) ⇒ [(X ∪ {X}) ∈ S]})] ⇒ (S ∈ ℱ)     definition of ℱ,
⊢ S ∈ ℱ                                                     M.P.
⊢ (S ∈ ℱ) ⇒ (⋂ℱ ⊆ S)                                        theorem 3.48,
⊢ ⋂ℱ ⊆ S                                                    Modus Ponens;
⊢ ⋂ℱ = S                                                    ⋂ℱ ⊆ S and S ⊆ ⋂ℱ.
⊢ S = ℕ                                                     definition of ℕ.


4.17 Because ∅ ∈ ℕ it follows that ⋂ℕ ⊆ ∅ whence ⋂ℕ = ∅.


4.19 Let S := V ∪ {∅, {∅}}. This proof shows that S = ℕ, whence V = S \
{∅, {∅}} = ℕ \ {∅, {∅}}.

For each X ∈ S, either X ∈ {∅, {∅}} or X ∉ {∅, {∅}}.

If X ∈ {∅, {∅}}, then either X = ∅ or X = {∅}.

In the first case (if X = ∅), then ∅ ∈ S, by pairing and union. Moreover, ∅ ∪
{∅} = {∅} ∈ V by hypothesis, whence ∅ ∪ {∅} ∈ S.

In the second case (if X = {∅}), then {∅} ∪ {{∅}} ∈ V whence {∅} ∪ {{∅}} ∈ S.

Otherwise (if X ∈ S \ {∅, {∅}}), then X ∈ V, whence X ∪ {X} ∈ V by hypothesis,
and hence X ∪ {X} ∈ S by union.

Thus, X ∪ {X} ∈ S for each X ∈ S, and ∅ ∈ S. Consequently, S = ℕ by the
Principle of Mathematical Induction.

Therefore, V = S \ {∅} = ℕ \ {∅}.


4.21 1 + 1 = 1 ∪ {1} by definition of M + 1 = M ∪ {M}, and 1 ∪ {1} = 2 by
definition of 2.


4.23 3 + 1 = 3 ∪ {3} by definition of M + 1 = M ∪ {M}, and 3 ∪ {3} = 4 by
definition of 4.


4.25 5 + 1 = 5 ∪ {5} by definition of M + 1 = M ∪ {M}, and 5 ∪ {5} = 6 by
definition of 6.


4.27 7 + 1 = 7 ∪ {7} by definition of M + 1 = M ∪ {M}, and 7 ∪ {7} = 8 by
definition of 8.


4.29 2 + 2 = 2 + (1 + 1) = (2 + 1) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (2 + 1) + 1 = 3 + 1 = 4 by the preceding exercises.

4.31 4 + 2 = 4 + (1 + 1) = (4 + 1) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (4 + 1) + 1 = 5 + 1 = 6 by the preceding exercises.


4.33 6 + 2 = 6 + (1 + 1) = (6 + 1) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (6 + 1) + 1 = 7 + 1 = 8 by the preceding exercises.


4.35 3 + 3 = 3 + (2 + 1) = (3 + 2) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (3 + 2) + 1 = 5 + 1 = 6 by the preceding exercises.


4.37 5 + 3 = 5 + (2 + 1) = (5 + 2) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (5 + 2) + 1 = 7 + 1 = 8 by the preceding exercises.


4.39 4 + 4 = 4 + (3 + 1) = (4 + 3) + 1 by definition of M + (N + 1) = (M + N) + 1,
and (4 + 3) + 1 = 7 + 1 = 8 by the preceding exercises.


4.41 2 * 2 = 2 * (1 + 1) = (2 * 1) + 2 by definition of M * (N + 1) = (M * N) + M,
and 2 * 1 = 2 by definition of M * 1 = M, whence (2 * 1) + 2 = 2 + 2 = 4 by the
preceding exercises.


4.43 4 * 2 = 4 * (1 + 1) = (4 * 1) + 4 by definition of M * (N + 1) = (M * N) + M,
and 4 * 1 = 4 by definition of M * 1 = M, whence (4 * 1) + 4 = 4 + 4 = 8 by the
preceding exercises.
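
The recursive definitions quoted in these solutions translate almost verbatim into recursive functions. The sketch below works on ordinary Python integers rather than on sets, and the base cases M + 0 = M and M * 0 = 0 are assumptions added here so that the recursion terminates; they agree with M + 1 = M ∪ {M} and M * 1 = M.

```python
def add(m, n):
    """Addition by recursion: M + 0 = M (assumed), M + (N + 1) = (M + N) + 1."""
    if n == 0:
        return m
    return add(m, n - 1) + 1

def multiply(m, n):
    """Multiplication by recursion: M * 0 = 0 (assumed), M * (N + 1) = (M * N) + M."""
    if n == 0:
        return 0
    return add(multiply(m, n - 1), m)

print(add(2, 2), add(4, 4), multiply(2, 2), multiply(4, 2))
```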


4.45 No, because 1 + (1 * 1) = 1 + 1 = 2 but (1 + 1) * (1 + 1) = 2 * 2 = 4.
4.47 Let C := ℕ × ℕ and specify G : C → C by G(K, L) := (K * L, L + 1). Then
define F : ℕ → C by F(0) := (1, 1) and F(I + 1) := G[F(I)]. Thus,


F(0) = (1, 1),
F(1) = G[F(0)] = G(1, 1) = (1 * 1, 1 + 1) = (1, 2),
F(2) = G[F(1)] = G(1, 2) = (1 * 2, 2 + 1) = (2, 3),
F(3) = G[F(2)] = G(2, 3) = (2 * 3, 3 + 1) = (6, 4),
⋮
F(I + 1) = G[F(I)] = G(I!, I + 1) = ((I!) * (I + 1), (I + 1) + 1)
= ((I + 1)!, I + 2).


Thus N! equals the first projection of F(N).
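
The iteration of G on pairs can be sketched directly. The code assumes the starting value F(0) = (1, 1) from the definition above; the name `factorial` for the first projection is introduced only for this illustration.

```python
def G(pair):
    """One step of the recursion: G(K, L) := (K * L, L + 1)."""
    k, l = pair
    return (k * l, l + 1)

def F(n):
    """F(0) := (1, 1) and F(I + 1) := G[F(I)]."""
    pair = (1, 1)
    for _ in range(n):
        pair = G(pair)
    return pair

def factorial(n):
    """N! as the first projection of F(N)."""
    return F(n)[0]

print([F(i) for i in range(4)], factorial(5))
```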
4.49 No, because (1 + 2)! = 3! = 6 but (1!) + (2!) = 1 + 2 = 3.
4.51 2 < 5 because 2 ∈ 5:


{∅, {∅}} ∈ {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}, {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}}}.




4.53 2 < 5 and 5 < 7 by previous exercises whence 2 < 7 by transitivity.


4.55 3 < 4, and 4 < 5, by definition of <, 5 < 7 by previous exercises, whence 
3 < 7 by transitivity. 


4.57 2 < 7 by previous exercises, and 7 < 8 by definition of < whence 2 < 8 by 
transitivity. 
4.59 If 1 < K and 1 < L, then by theorem 4.58 there exist I ∈ ℕ* and J ∈ ℕ* with
K = I + 1 and L = J + 1. Hence K * L = (I + 1) * (J + 1) = I * J + I + J + 1 >
0 + 1 + 1 + 1.


4.61 If 3 < K and 3 < L, then 3 * 3 < K * L, whence 7 < 9 < K * L; thus K < 4
and L < 4. The only possibilities with 1 < K < 4 and 1 < L < 4 are K ∈ {2, 3}
and L ∈ {2, 3}. Yet 2 * 2 = 4 ≠ 7, 2 * 3 = 3 * 2 = 6 ≠ 7, and 3 * 3 = 9 ≠ 7.


4.63 Similarly, K * L = 9 if and only if (K, L) ∈ {(1, 9), (3, 3), (9, 1)}.


4.65 If S ≠ ∅ and S ⊆ ℕ then S has a smallest element I ∈ S, by the Well-Ordering
Principle. Consequently, for every J ∈ S it follows that I ≤ J, which means that
(I = J) ∨ (I < J), whence (I < J) for I ≠ J. Thus ¬(J < I) because only one of
the three relations (<, =, >) holds. From ¬(J ∈ I) for every J ∈ S it follows that
I ∩ S = ∅.


4.67 This proof proceeds by contradiction. Negating the conclusion gives


¬[(I ∉ K) ∨ (K ∉ L) ∨ (L ∉ I)] ⇔ {[¬(I ∉ K)] ∧ [¬(K ∉ L)] ∧ [¬(L ∉ I)]}
⇔ (I ∈ K) ∧ (K ∈ L) ∧ (L ∈ I).


If (I ∈ K) ∧ (K ∈ L) ∧ (L ∈ I) for some I, K, L ∈ ℕ, then I ∈ I by transitivity of ∈
on ℕ, but J ∉ J for every J ∈ ℕ. Therefore the negation of the conclusion is False.


4.69 This proof proceeds by contradiction.

First, I ∉ I for every I ∈ ℕ: ⊢ ∀I{(I ∈ ℕ) ⇒ [¬(I ∈ I)]}.

Second, specialization (Subf) with I := ℕ gives (ℕ ∈ ℕ) ⇒ [¬(ℕ ∈ ℕ)], which has
the pattern (P) ⇒ [¬(P)].

From the tautology {(P) ⇒ [¬(P)]} ⇒ [¬(P)] it then follows that ¬(ℕ ∈ ℕ).
4.71 Calculate [(2, 4)]≃ + [(6, 3)]≃ = [(0, 2)]≃ + [(3, 0)]≃ = [(0 + 3, 2 + 0)]≃ =
[(3, 2)]≃ = [(1, 0)]≃.


4.73 Calculate [(3, 1)]≃ − [(5, 2)]≃ = [(3, 1)]≃ + [(2, 5)]≃ = [(2, 0)]≃ + [(0, 3)]≃ =
[(2 + 0, 0 + 3)]≃ = [(2, 3)]≃ = [(0, 1)]≃.

4.75 Kunen’s addition of pairs of natural numbers commutes.
[(M, N)]≃ + [(P, Q)]≃ = [(M + P, N + Q)]≃
= [(P + M, Q + N)]≃
= [(P, Q)]≃ + [(M, N)]≃.


4.77 Kunen’s addition of pairs of natural numbers is associative.
([(K, L)]≃ + [(M, N)]≃) + [(P, Q)]≃ = [(K + M, L + N)]≃ + [(P, Q)]≃
= [((K + M) + P, (L + N) + Q)]≃
= [(K + (M + P), L + (N + Q))]≃
= [(K, L)]≃ + [(M + P, N + Q)]≃
= [(K, L)]≃ + ([(M, N)]≃ + [(P, Q)]≃).


4.79 Kunen’s multiplication of pairs of natural numbers commutes.

[(M, N)]≃ * [(P, Q)]≃ = [(M * P + N * Q, M * Q + N * P)]≃
= [(P * M + Q * N, Q * M + P * N)]≃
= [(P * M + Q * N, P * N + Q * M)]≃
= [(P, Q)]≃ * [(M, N)]≃.


4.81 Kunen’s multiplication of pairs of natural numbers is associative.
[(K, L)]≃ * ([(M, N)]≃ * [(P, Q)]≃)
= [(K, L)]≃ * [(M * P + N * Q, M * Q + N * P)]≃

= [(K * (M * P + N * Q) + L * (M * Q + N * P),
K * (M * Q + N * P) + L * (M * P + N * Q))]≃

= [((K * P + L * Q) * M + (K * Q + L * P) * N,
(K * P + L * Q) * N + (K * Q + L * P) * M)]≃

= [(K * P + L * Q, K * Q + L * P)]≃ * [(M, N)]≃

= ([(K, L)]≃ * [(P, Q)]≃) * [(M, N)]≃.

4.83 Kunen’s multiplication of pairs of natural numbers distributes over addition.
[(K, L)]≃ * ([(M, N)]≃ + [(P, Q)]≃)
= [(K, L)]≃ * [(M + P, N + Q)]≃
= [(K * (M + P) + L * (N + Q), K * (N + Q) + L * (M + P))]≃
= [(K * P + L * Q + K * M + L * N, K * Q + L * P + K * N + L * M)]≃
= [(K * P + L * Q, K * Q + L * P)]≃ + [(K * M + L * N, K * N + L * M)]≃
= ([(K, L)]≃ * [(P, Q)]≃) + ([(K, L)]≃ * [(M, N)]≃).
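
These identities can be spot-checked on representatives. The sketch below assumes that a pair (M, N) of natural numbers stands for the integer M − N, that two pairs (M, N) and (P, Q) are equivalent when M + Q = N + P, and that addition and multiplication act on representatives as in the solutions above; the function names are chosen only for this illustration.

```python
def equivalent(a, b):
    """(M, N) and (P, Q) represent the same integer when M + Q = N + P."""
    (m, n), (p, q) = a, b
    return m + q == n + p

def add(a, b):
    """Addition of representatives, coordinate by coordinate."""
    (m, n), (p, q) = a, b
    return (m + p, n + q)

def multiply(a, b):
    """Multiplication of representatives: (M*P + N*Q, M*Q + N*P)."""
    (m, n), (p, q) = a, b
    return (m * p + n * q, m * q + n * p)

total = add((2, 4), (6, 3))              # exercise 4.71
k, m, p = (1, 4), (2, 0), (3, 5)         # arbitrary sample pairs
left = multiply(k, add(m, p))
right = add(multiply(k, p), multiply(k, m))
print(total, equivalent(total, (1, 0)), equivalent(left, right))
```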


4.85 Subtraction does not commute:


[(0, 0)]≃ − [(1, 0)]≃ = [(0, 0)]≃ + [(0, 1)]≃
= [(0 + 0, 0 + 1)]≃
= [(0, 1)]≃;

[(1, 0)]≃ − [(0, 0)]≃ = [(1, 0)]≃ + [(0, 0)]≃
= [(1 + 0, 0 + 0)]≃
= [(1, 0)]≃


where [(0, 1)]≃ ≠ [(1, 0)]≃ because M + Q = 0 + 0 ≠ 1 + 1 = N + P.


4.87 Kunen’s multiplication of pairs of natural numbers distributes over subtraction.


X * (Y − Z) = X * [Y + (−Z)] = (X * Y) + [X * (−Z)] = (X * Y) + [−(X * Z)]
= (X * Y) − (X * Z).


4.89 Subtraction does not distribute over addition.


1 − (1 + 1) = 1 − 2 = −1,
(1 − 1) + (1 − 1) = 0 + 0 = 0.

4.91 By definition 4.102, (2/3) + (7/5) = [(2 * 5) + (3 * 7)]/(3 * 5) = 31/15.
4.93 By theorem 4.110, (7/3) − (2/5) = (7/3) + [(−2)/5] = {(7 * 5) + [3 *
(−2)]}/(3 * 5) = 29/15.

4.95 By definition 4.102, (2/3) * (7/5) = (2 * 7)/(3 * 5) = 14/15.

4.97 By definition 4.117, (2/3) ÷ (7/5) = (2/3) * (5/7) = (2 * 5)/(3 * 7) = 10/21.
4.99 By exercise 4.97, (2/3) ÷ (7/5) = 10/21. Similarly, (7/5) ÷ (2/3) = (7/5) *
(3/2) = (7 * 3)/(5 * 2) = 21/10. However, 10/21 ≠ 21/10, because, according to
the relation in definition 557, (10, 21) ≁ (21, 10) since 10 * 10 ≠ 21 * 21 in ℕ.
4.101 (2/3) ÷ [(7/5) + (1/1)] = (2/3) ÷ {[(7 * 1) + (1 * 5)]/(5 * 1)} = (2/3) ÷
(12/5) = (2/3) * (5/12) = (2 * 5)/(3 * 12) = 5/18. However, [(2/3) ÷ (7/5)] +
[(2/3) ÷ (1/1)] = (10/21) + (2/3) = [(10 * 3) + (21 * 2)]/(21 * 3) = 72/63, and
72/63 ≠ 5/18 because 72 * 18 ≠ 63 * 5.

4.103 (1/1) + [(2/3) ÷ (7/5)] = (1/1) + (10/21) = [(1 * 21) + (1 * 10)]/(1 *
21) = 31/21. However, [(1/1) + (2/3)] ÷ [(1/1) + (7/5)] = (5/3) ÷ (12/5) =
(5/3) * (5/12) = 25/36, and 25/36 ≠ 31/21 because 25 * 21 ≠ 36 * 31.
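
Rational arithmetic of this kind is easy to replay with Python's standard `fractions` module, which keeps every value as an exact ratio of integers. The sketch is only a check of the numerical values; the variable names are chosen here for the example.

```python
from fractions import Fraction

a, b, one = Fraction(2, 3), Fraction(7, 5), Fraction(1, 1)

total = a + b                                   # exercise 4.91
difference = Fraction(7, 3) - Fraction(2, 5)    # exercise 4.93
quotient = a / b                                # exercise 4.97
reverse_quotient = b / a                        # exercise 4.99

# Division does not distribute over addition (exercise 4.101).
divide_by_sum = a / (b + one)
sum_of_quotients = a / b + a / one

print(total, difference, quotient, reverse_quotient)
print(divide_by_sum, divide_by_sum == sum_of_quotients)
```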

4.105 If 0 < (I/J) and 0 < (P/Q), then 0 < I * J and 0 < P * Q. Also, (I/J) +
(P/Q) = (I * Q + J * P)/(J * Q), with


(I * Q + J * P) * (J * Q) = (I * J) * (Q * Q) + (J * J) * (P * Q) > 0


because Q * Q > 0 and J * J > 0. Hence 0 < [(I/J) + (P/Q)].

4.107 (P/1)/(Q/R) = (P/1) * (R/Q) = (P * R)/(1 * Q) = (P * R)/Q. 

4.109 If there exists M ∈ ℕ such that K/L = M/1, then N = M. In the alternative,
by theorem 4.133, for each K/L ∈ ℚ, there exists N ∈ ℕ with K/L < N/1.


Consequently, the set S := {N ∈ ℕ : K/L < N/1} is not empty. By the Well-Ordering
Principle, S contains a smallest element, denoted here by N.


4.111 By example 4.135 with N := ∅ it follows that #(∅) = ∅ = 0.
Alternatively, #(∅) = 0 because of the bijection ∅ : ∅ → ∅ and ∅ = 0.


4.113 By example 3.24, 𝒫({∅}) = {∅, {∅}} = 2. By definition 4.140 with N :=
2, it follows that #(2) = 2.


4.115 From #(𝒫[𝒫(∅)]) = 2 follows #(𝒫{𝒫[𝒫(∅)]}) = 4 whence
#(𝒫(𝒫{𝒫[𝒫(∅)]})) = 16.
Alternatively, #(𝒫(𝒫{𝒫[𝒫(∅)]})) = 16 because


𝒫(𝒫{𝒫[𝒫(∅)]})
= 𝒫(𝒫{𝒫[{∅}]})
= 𝒫[{∅, {∅}, {{∅}}, {∅, {∅}}}]
= {∅,
{∅}, {{∅}}, {{{∅}}}, {{∅, {∅}}},
{∅, {∅}}, {∅, {{∅}}}, {∅, {∅, {∅}}},
{{∅}, {{∅}}}, {{∅}, {∅, {∅}}}, {{{∅}}, {∅, {∅}}},
{∅, {∅}, {{∅}}}, {∅, {∅}, {∅, {∅}}},
{∅, {{∅}}, {∅, {∅}}}, {{∅}, {{∅}}, {∅, {∅}}},
{∅, {∅}, {{∅}}, {∅, {∅}}}},
which has as many elements as 16 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}.
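
Counting the iterated power sets by hand is error-prone, so a mechanical count may be reassuring. The sketch below is one possible encoding, with frozensets standing for sets and a helper `power_set` written only for this illustration.

```python
from itertools import combinations

def power_set(s):
    """The set of all subsets of a finite set, as a frozenset of frozensets."""
    items = list(s)
    return frozenset(
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    )

level = frozenset()      # start from the empty set
sizes = []
for _ in range(4):       # apply the power set four times
    level = power_set(level)
    sizes.append(len(level))

print(sizes)
```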


4.117 No, not all ordered pairs need to have the same cardinality. If X ≠ Y,
then (X, Y) = {{X}, {X, Y}} with {X} ≠ {X, Y}. In contrast, if X = Y, then
(X, X) = {{X}, {X, X}} = {{X}, {X}} = {{X}}. There exist only two functions
from (X, X) to (X, Y), neither of which is a bijection: F : {X} ↦ {X}, and
G : {X} ↦ {X, Y}. Consequently, there does not exist any bijection from (X, X)
to (X, Y), whence (X, X) and (X, Y) do not have the same cardinality.


4.119 By theorem 4.150 and by associativity of *, #[(A × B) × C] = [#(A × B)] *
[#(C)] = {[#(A)] * [#(B)]} * [#(C)] = [#(A)] * {[#(B)] * [#(C)]} = [#(A)] * [#(B × C)] =
#[A × (B × C)].


4.121 Apply contraposition to theorem 4.146. 


4.123 If A is denumerable, then there is a bijection F : ℕ → A. Consequently, the
function G : ℕ* → A defined by G(N) := F(N − 1) is also a bijection.

If moreover X ∉ A, then H : {0} → {X} is also a bijection.

Thus, G ∪ H is a bijection (G ∪ H) : ({0} ∪ ℕ*) → ({X} ∪ A). Hence A ∪ {X} is
denumerable.

Now proceed by induction with the number of elements in B.


4.125 Use theorem 4.160 and (A ∪ B) = (A \ B) ∪ (A ∩ B) ∪ (B \ A).
4.127 Apply theorem 4.156 and exercise 4.126. 
4.129 Apply theorems 4.156 and 4.167. 


4.131 Any non-surjective injection F : X → Y induces a non-surjective injection
G : 2^X → 2^Y by restriction from Y to X.


4.133 #(2^X) = 2^{#(X)} > #(X) by induction with #(X) ∈ ℕ.

4.135 If A is denumerable, then there exists a bijection F : A → ℕ, which restricts
to an injection on each subset S ⊆ A.


4.137 Apply theorem 4.166 twice. 


4.139 The existence of every function F_i relies on the Axiom of Choice.


Exercises from Chapter 5 


5.1 For each X ∈ ℱ let F(X) be the smallest element of X.
5.3 The set ℤ is nonempty but has no smallest element relative to < or ≤.


5.5 With M := 2, define [0]_2 < [1]_2. Let [I]_M := [1]_2, [K]_M := [0]_2, [L]_M := [1]_2.
Then [0]_2 < [1]_2 but [1]_2 + [0]_2 ≮ [1]_2 + [1]_2.


5.7 For each countable set C there exists an injection F : C → ℕ. Define a relation
R on C by X R Y if and only if F(X) < F(Y).


5.9 If W does not contain a last element,

¬{∃Z∀Y[(Y ≺ Z) ∨ (Y = Z)]}
whence

∀Z∃Y{[¬(Y ≺ Z)] ∧ (Y ≠ Z)}.


In other words, for each Z ∈ W there exists Y ∈ W such that ¬(Y ≺ Z) and Y ≠ Z.
In particular, Y ∉ W_Z. In yet other words, there does not exist any Z ∈ W such that
W = W_Z. Because ≺ totally orders W, however, it follows that Z ≺ Y must hold.
Let 𝒢 consist of all the initial intervals in W. Then ⋃𝒢 = W, but there does not
exist any Z ∈ W such that W = W_Z. Consequently, ⋃𝒢 is not an initial interval
of W.


5.15 There is a transitive set on which ∈ is not a transitive relation. For example,
the set


A := {∅, {∅}, {{∅}}}


is transitive, because every element of A is also a subset of A. Nevertheless, the
relation ∈ is not transitive on A, because ∅ ∉ {{∅}}, even though


∅ ∈ {∅}, {∅} ∈ {{∅}}.
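
Both properties can be tested by brute force on this three-element set. The sketch below encodes sets as frozensets; the two predicate names are assumptions made for this illustration.

```python
EMPTY = frozenset()                 # the empty set
ONE = frozenset({EMPTY})            # the set containing only the empty set
NESTED = frozenset({ONE})           # the set containing only ONE
A = frozenset({EMPTY, ONE, NESTED})

def is_transitive_set(s):
    """Every element of s is also a subset of s."""
    return all(element <= s for element in s)

def membership_is_transitive_on(s):
    """Whenever x is in y and y is in z, with x, y, z in s, then x is in z."""
    return all(x in z
               for x in s for y in s for z in s
               if x in y and y in z)

print(is_transitive_set(A), membership_is_transitive_on(A))
```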
Exercises from Chapter 6


6.1 Each finite set of pairwise disjoint sets ℱ = {A_0, …, A_{N−1}} where each
element A_I has exactly N_I elements has exactly ∏_{I=0}^{N−1} (N_I) choice functions.


6.3 Theorem 6.6 shows that each finite set of nonempty sets has a choice function.
The proof of theorem 6.20 shows that each choice function corresponds to a choice 
set. 


6.5 The Choice-Set Principle 6.17 translates into formula (1):


(∀A{(A ∈ ℱ) ⇒ (A ≠ ∅)})                                          (1)
⇒ {∃S[(S ⊆ ⋃ℱ) ∧ (∀A{(A ∈ ℱ) ⇒ [∃X(S ∩ A = {X})]})]}


6.7 For each set ℱ of nonempty sets, define a relation R ⊆ ℱ × ⋃ℱ by R :=
{(A, X) : (A ∈ ℱ) ∧ (X ∈ A)}. If the Choice-Relation Principle 6.15 holds, then
there exists a function F ⊆ R with the same domain as that of R. Hence F is a choice
function. Thus the Choice-Function Principle 6.8 holds.

Conversely, for each relation R ⊆ A × B, and for each set X in the domain of R,
define the vertical section of R at X by R_X := {Y ∈ B : (X, Y) ∈ R}. In particular,
R_X ≠ ∅ because X is in the domain of R. If the Choice-Function Principle 6.8 holds,
then there exists a choice function C : ℱ → ⋃ℱ for ℱ := {R_X : ∃Y[(X, Y) ∈ R]}.
Thus C(R_X) ∈ R_X for every X. Hence parametrize the vertical sections of R by
S : A → 𝒫(B) with S(X) := R_X, and set F := C ∘ S. Thus F is a function,
with F(X) = C[S(X)] = C(R_X) ∈ R_X, whence F ⊆ R. Thus the Choice-Relation
Principle 6.15 holds.


6.9 In every partially ordered set A the empty subset ∅ ⊆ A is a chain. If the
hypothesis of Zorn’s Maximal-Element Principle 6.32 holds, then A contains an
upper bound for ∅. Hence A ≠ ∅ [88, p. 118].


6.11 Zorn’s Maximal-Set Principle 6.34 translates into formula (2): 

∀ℱ([(ℱ ≠ ∅) ∧ ∀𝒞{([𝒞 ∈ 𝒫(ℱ)] ∧ (∀A∀B{[(A ∈ 𝒞) ∧ (B ∈ 𝒞)] 
⇒ [(A ⊆ B) ∨ (B ⊆ A)]})) ⇒ (⋃𝒞 ∈ ℱ)}]                        (2) 


⇒ {∃S[(S ∈ ℱ) ∧ {∀A[(A ∈ ℱ) ⇒ (S ⊄ A)]}]}) 

6.15 The Multiplicative Principle 6.36 is formula (3): 


∀I∀ℱ∀A{[(I ≠ ∅) ∧ (A : I → ℱ) ∧ (∀i{(i ∈ I) ⇒ [A(i) ≠ ∅]})] 
                                                               (3) 
⇒ [∏_{i∈I} A(i) ≠ ∅]}. 


6.17 


(T.A) ∅ ∈ ℐ by example 6.50; 

(T.B) if 𝒞 ⊆ ℐ is linearly ordered by inclusion, and if U, W ∈ ⋃𝒞, then there 
exist intervals H, K ∈ 𝒞 such that U ∈ H and W ∈ K, with H ⊆ K or K ⊆ H. 
In the first case, if H ⊆ K, then U, W ∈ K. If also V ∈ ℕ and U < V < W, then 
V ∈ K ⊆ ⋃𝒞. Thus ⋃𝒞 ∈ ℐ; the second case is similar; 

(T.C) if A ∈ ℐ, then either A = ∅ and A ∪ {F(A)} = {0} ∈ ℐ, or A ≠ ∅ and 
A ∪ {F(A)} = A ∈ ℐ, because F(A) = min(A) ∈ A. 


6.19 


(T.A) ∅ ∈ 𝒫(E); 
(T.B) if 𝒞 ⊆ 𝒫(E), then ⋃𝒞 ⊆ E, whence ⋃𝒞 ∈ 𝒫(E); 
(T.C) if A ∈ 𝒫(E), then A ∪ {F(A)} ∈ 𝒫(E), because F(A) ∈ E. 


6.21 For each denumerable element D_ℓ of a denumerable family 𝒟 = {D_k : k ∈ ℕ} 
there exists a bijection F : ℕ → D_ℓ. Thus the set B_ℓ of all such bijections is not 
empty. Let ℬ := {B_ℓ : ℓ ∈ ℕ}. In the Zermelo-Fraenkel-Choice set theory, there 
exists a family choice-function F : ℕ → ⋃ℬ such that F_ℓ := F(ℓ) ∈ B_ℓ for each 
ℓ ∈ ℕ; thus each F_ℓ : ℕ → D_ℓ is a bijection. Hence the function ℕ × ℕ → ⋃𝒟 
with (k, ℓ) ↦ F_ℓ(k) is a bijection. 


Exercises from Chapter 7 


7.1 There are no dominant pure strategies in The Battle of the Sexes. 
7.3 The two positions where they both go to the same show are Nash equilibria. 


7.5 There are two-person games with a Nash equilibrium but without any dominant 
strategy and hence without dominant strategy equilibrium, for instance, The Battle 
of the Sexes (exercises 7.1 and 7.3). 
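
The claims in exercises 7.1, 7.3, and 7.5 can be verified by exhaustive search on a small payoff table. The sketch below assumes one common version of The Battle of the Sexes; the show names and the numerical payoffs are assumptions chosen for illustration and may differ from the table in the text.

```python
from itertools import product

SHOWS = ("ballet", "boxing")
# PAYOFF[(a, b)] = (payoff to player 0, payoff to player 1); values are assumed.
PAYOFF = {
    ("ballet", "ballet"): (2, 1),
    ("boxing", "boxing"): (1, 2),
    ("ballet", "boxing"): (0, 0),
    ("boxing", "ballet"): (0, 0),
}

def is_nash(a, b):
    """Neither player gains by changing strategy unilaterally."""
    best_a = all(PAYOFF[(a, b)][0] >= PAYOFF[(x, b)][0] for x in SHOWS)
    best_b = all(PAYOFF[(a, b)][1] >= PAYOFF[(a, y)][1] for y in SHOWS)
    return best_a and best_b

def dominant_strategies(player):
    """Strategies at least as good as every alternative against every opposing move."""
    def pay(own, other):
        profile = (own, other) if player == 0 else (other, own)
        return PAYOFF[profile][player]
    return [s for s in SHOWS
            if all(pay(s, o) >= pay(t, o) for t in SHOWS for o in SHOWS)]

nash = [p for p in product(SHOWS, SHOWS) if is_nash(*p)]
print(nash)
print(dominant_strategies(0), dominant_strategies(1))
```

With these payoffs the search finds exactly the two positions where both players attend the same show, and no dominant strategy for either player, even in the weak sense.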


7.7 For all x ∈ A and y ∈ B, 


min{f(x, y) : y ∈ B} ≤ f(x, y) ≤ max{f(x, y) : x ∈ A}. 


Consequently, 


min{f(x, y) : y ∈ B} ≤ max{f(x, y) : x ∈ A}, 

where the left-hand side depends only on x while the right-hand side depends only 
on y. Therefore, 


max{min{f(x, y) : y ∈ B} : x ∈ A} ≤ min{max{f(x, y) : x ∈ A} : y ∈ B}. 
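
A numerical instance may help to see that the inequality can be strict. The following sketch evaluates both sides for a function f given by a small table; the table entries and the names maximin and minimax are illustrative assumptions.

```python
def maximin(f, A, B):
    """max over x in A of min over y in B of f(x, y)."""
    return max(min(f(x, y) for y in B) for x in A)

def minimax(f, A, B):
    """min over y in B of max over x in A of f(x, y)."""
    return min(max(f(x, y) for x in A) for y in B)

# Illustrative table: rows are indexed by x in A, columns by y in B.
table = [[3, -1, 2],
         [0, 4, -2]]
A, B = range(2), range(3)

def f(x, y):
    return table[x][y]

print(maximin(f, A, B), minimax(f, A, B))
```

Here the row minima are −1 and −2, and the column maxima are 3, 4, and 2, so the left-hand side is −1 and the right-hand side is 2.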


7.9 For each strategy S_Al available to Al, Al identifies the worst payoff 
p^Al_{S_Al, S_Bo} under every strategy S_Bo available to Bo: 


min{p^Al_{S_Al, S_Bo} : S_Bo ∈ Bo’s strategies}. 


To avoid the worst of the worst, Al plays the strategy S_Al that returns the best among 
the worst payoffs: 


payoff to Al = max{min{p^Al_{S_Al, S_Bo} : S_Bo ∈ Bo’s strategies} : S_Al ∈ Al’s strategies}. 


7.11 If a third player Ci has a single weakly dominant strategy, then Ci will play 
that strategy, leaving Al and Bo with a game for two players and any number of pure 
strategies, which need not have any Nash equilibrium. 


7.13 To avoid failure, the Blue commander must play Middle. Thus the Red 
commander knows that Blue plays Middle; to avoid failure, Red must then play 
either Left or Right. In either case, (M,L) or (M,R), both commanders get a Draw. 


7.15 Each of the N − 1 girls who rejected him did so because she received and holds 
a better proposal. In particular, the last girl who rejected him received at least two 
proposals. Consequently, there is still at least one girl c who has not yet received 
any proposal. In the next round, b must then propose to c, which terminates the 
algorithm, because all the other girls have received at least one proposal (from b) 
and therefore hold one. 
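
The situation described above can be traced in code. The following is a minimal sketch of the standard deferred-acceptance procedure in which boys propose, assuming equal numbers of boys and girls with complete preference lists; it is not necessarily identical to the algorithm stated in the text, and all names are illustrative.

```python
def deferred_acceptance(boy_prefs, girl_prefs):
    """boy_prefs[b]: girls in decreasing preference; girl_prefs[g]: boys likewise."""
    rank = {g: {b: i for i, b in enumerate(p)} for g, p in girl_prefs.items()}
    next_choice = {b: 0 for b in boy_prefs}   # index of the next girl to propose to
    holds = {}                                # girl -> boy whose proposal she holds
    free = list(boy_prefs)
    while free:
        b = free.pop()
        g = boy_prefs[b][next_choice[b]]
        next_choice[b] += 1
        current = holds.get(g)
        if current is None:                   # first proposal: she holds it
            holds[g] = b
        elif rank[g][b] < rank[g][current]:   # she prefers the new proposal
            holds[g] = b
            free.append(current)
        else:                                 # she rejects the new proposal
            free.append(b)
    return {b: g for g, b in holds.items()}

# Illustrative preferences with N = 2: both boys prefer x, and x prefers b.
boys = {"a": ["x", "y"], "b": ["x", "y"]}
girls = {"x": ["b", "a"], "y": ["a", "b"]}
print(deferred_acceptance(boys, girls))
```

In this run boy a is rejected by N − 1 = 1 girl, who holds a better proposal, and then proposes to the only girl without a proposal, at which point the procedure stops.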


7.17 Depending on the beggars’ and choosers’ rankings, infinite sets need not admit 
of any total stable relations. Let the beggars and choosers consist of all negative and 
positive integers, indexed by their values: 


B := ℤ⁻ = {..., −3, −2, −1};   C := ℤ⁺ = {1, 2, 3, ...}; 
b_{−m} := −m;   c_ℓ := ℓ. 


Because beggar b_{−m} begs for m, every chooser c ranks all beggars with the same 
well-order, in which b_{−k} precedes b_{−ℓ} if and only if k < ℓ. In contrast, beggar 
b_{−m} ranks choosers by swapping 1, ..., 2m with 2m+1, ..., 4m, and otherwise 
leaving the natural order on ℤ⁺ unchanged: 


2m+1, ..., 4m, 1, ..., 2m, 4m+1, 4m+2, ... 

With such rankings, there are no total stable relations. The proof proceeds by 
contraposition, showing that every total relation is unstable. Indeed, if a relation 
M ⊆ B × C is total, then there exists k ∈ ℤ⁺ with (b_{−k}, c_1) ∈ M, preceded by 
k − 1 ≥ 0 couples 


(b_{−1}, c_{ℓ_1}), ..., (b_{−(k−1)}, c_{ℓ_{k−1}}), (b_{−k}, c_1), ..., (b_{−n}, c_{ℓ_n}), ... 


Yet b_{−k} prefers 2k > k − 1 choosers c_{2k+1}, ..., c_{4k} to c_1. Hence there exists n > k 
such that (b_{−n}, c_{ℓ_n}) ∈ M with 2k + 1 ≤ ℓ_n ≤ 4k, whence (b_{−n}, c_{ℓ_n}) ⋈ (b_{−k}, c_1), so 
that M is unstable. 


7.19 Extend algorithm 7.31 (with a match maker) to stable relations that may relate 
each chooser to more than one beggar, with a quota q_c of beggars for chooser c 
(polygamy, polyandry, schools admitting more than one student, etc.). See Gale & 
Shapley’s references [40, 41]. 


7.21 For all (b_1, c_1), (b_2, c_2) ∈ ⋃𝒞 there exist stable relations M_1, M_2 ∈ 𝒞 with 
(b_1, c_1) ∈ M_1 and (b_2, c_2) ∈ M_2. Yet M_1 ⊆ M_2 or M_2 ⊆ M_1 because 𝒞 is a chain. 
Thus (b_1, c_1), (b_2, c_2) ∈ M := max{M_1, M_2} ∈ 𝒞, but M is stable, so that none of 
the three conditions in definition 7.27 can hold. Therefore ⋃𝒞 is stable. 

7.25 From (X, X) ∈ ℛ°⁻¹ if and only if (X, X) ∈ ℛ follows (X, X) ∉ ℛ \ ℛ°⁻¹. 


7.27 If (X, Y), (Y, Z) ∈ 𝒫_ℛ = ℛ \ ℛ°⁻¹ ⊆ ℛ, then (X, Z) ∈ ℛ, by transitivity 
of ℛ. 

If also (X, Z) ∈ ℛ°⁻¹, then (Z, X) ∈ ℛ, but (X, Y) ∈ ℛ by hypothesis, whence 
(Z, Y) ∈ ℛ by transitivity of ℛ, so that (Y, Z) ∈ ℛ°⁻¹, contradicting the hypothesis 
that (Y, Z) ∉ ℛ°⁻¹. 
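
Both properties of the strict part (irreflexivity from exercise 7.25 and transitivity from exercise 7.27) can be tested on a finite example. The sketch below assumes that a relation is modeled as a Python set of pairs; the example relation and the helper names are illustrative.

```python
def strict_part(R):
    """Pairs of R whose reverse is not in R: R minus its inverse."""
    return {(x, y) for (x, y) in R if (y, x) not in R}

def is_transitive(R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

def is_irreflexive(R):
    return all(x != y for (x, y) in R)

# Illustrative transitive relation: 1 and 2 are indifferent, and both precede 3.
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1), (1, 3), (2, 3)}
P = strict_part(R)

print(sorted(P))
print(is_transitive(R), is_transitive(P), is_irreflexive(P))
```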


7.29 Combine the preceding solutions. 

7.31 With at least two voters Val and Vic and at least two alternatives Al and Bo, 
there is a voters’ profile r such that Val prefers Al to Bo while Vic prefers Bo to Al: 


r(Val) = {(Al, Bo)};   r(Vic) = {(Bo, Al)}. 


If σ is a permutation that swaps Val and Vic, then 


(r ∘ σ)(Val) = r[σ(Val)] = r(Vic) = {(Bo, Al)}; 
(r ∘ σ)(Vic) = r[σ(Vic)] = r(Val) = {(Al, Bo)}. 


If Val is a dictator, so that ℱ(s) = s(Val) for every voters’ profile s, then ℱ(r) = 
r(Val) = {(Al, Bo)} while ℱ(r ∘ σ) = (r ∘ σ)(Val) = {(Bo, Al)}. Thus ℱ(r) ≠ 
ℱ(r ∘ σ), so that ℱ is not invariant under permutations. 
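
The computation above is small enough to replay directly. In this sketch a profile is modeled as a dictionary from voters to sets of ordered pairs, and the permutation as a dictionary from voters to voters; these representations and the function names are assumptions made for illustration.

```python
# Profile r: Val prefers Al to Bo, while Vic prefers Bo to Al.
r = {"Val": {("Al", "Bo")}, "Vic": {("Bo", "Al")}}
# Permutation sigma swapping the two voters.
sigma = {"Val": "Vic", "Vic": "Val"}

def compose(profile, perm):
    """The profile r o sigma, mapping each voter v to r[sigma(v)]."""
    return {v: profile[perm[v]] for v in profile}

def dictator(profile):
    """Social choice dictated by Val: returns Val's own preference."""
    return profile["Val"]

print(dictator(r))
print(dictator(compose(r, sigma)))
```

The two outputs differ, which confirms that a dictatorship is not invariant under permutations of the voters.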


7.33 Weak unanimity implies unanimity. 